The Complexity of Gradient Descent: CLS = PPAD ∩ PLS
We consider the problem of computing a Gradient Descent solution of a continuously differentiable function f on a bounded convex domain, i.e., finding a "point where Gradient Descent terminates". Equivalently, this corresponds to computing a so-called KKT (Karush-Kuhn-Tucker) stationary point of f.
We study this problem in the "white box" model, where the function f is given in the input, e.g., as an arithmetic circuit. By some standard arguments, it follows that the problem lies in the intersection of the complexity classes PPAD and PLS. PPAD is mostly known for capturing the complexity of computing a Nash equilibrium, while PLS has been successful in characterizing the complexity of various local optimization problems, such as Local-Max-Cut.
We prove that the Gradient Descent problem is complete for PPAD ∩ PLS. In particular, this implies that the class CLS ("Continuous Local Search") which was previously thought to be a strict subset of PPAD ∩ PLS, is in fact equal to it.