Saddle Avoidance, Asymptotic Normality, and Exponential Acceleration in Nonsmooth Optimization
Optimization-based algorithms form the foundation of empirically successful methods in modern fields such as artificial intelligence and data science. Although classical optimization theory provides guarantees for functions that are smooth or convex, a significant portion of modern problems possess neither property. Despite worst-case examples for which efficient algorithms are unavailable, typical nonsmooth functions possess a "partly smooth" structure, meaning they are well-behaved relative to a smooth "active manifold." This thesis develops and analyzes first-order algorithms that exploit this nonsmooth structure. We first introduce two regularity conditions describing how subgradients interact with active manifolds and show that they hold for a broad and generic class of functions. With these cornerstones in place, we demonstrate that, when randomly perturbed or driven by stochastic noise, subgradient methods converge only to minimizers of generic, Clarke regular semialgebraic problems. When convergence to a particular minimizer is known, we show that stochastic (projected) subgradient methods are asymptotically normal, making them asymptotically optimal in the locally minimax sense of Hájek and Le Cam. These findings culminate in a new first-order algorithm, NTDescent, which exhibits locally nearly linear convergence on typical nonsmooth functions with quadratic growth. The convergence rate of NTDescent depends only on intrinsic quantities of the function, not on the problem's ambient dimension.
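To give a concrete flavor of the stochastic subgradient methods discussed above, the following is a minimal sketch, not taken from the thesis: a noisy subgradient iteration with diminishing step sizes applied to the nonsmooth function f(x) = |x|, whose Clarke subdifferential at 0 is the interval [-1, 1]. The step-size schedule and noise level here are illustrative choices.

```python
import random

def subgrad_abs(x):
    """A subgradient of f(x) = |x|; any value in [-1, 1] is valid at x = 0."""
    if x > 0.0:
        return 1.0
    if x < 0.0:
        return -1.0
    return 0.0

def stochastic_subgradient(x0, steps=10000, noise=0.1, seed=0):
    """Run x_{k+1} = x_k - alpha_k * (g_k + xi_k) with alpha_k = k^(-0.75).

    g_k is a subgradient at x_k and xi_k is Gaussian oracle noise, so the
    iterate drifts toward the unique minimizer x = 0 despite nonsmoothness.
    """
    rng = random.Random(seed)
    x = x0
    for k in range(1, steps + 1):
        g = subgrad_abs(x) + rng.gauss(0.0, noise)  # noisy subgradient oracle
        x -= g / k ** 0.75  # diminishing step size: not summable, square-summable
        # (sum alpha_k diverges so the iterate can travel; sum alpha_k^2
        # converges so the accumulated noise stays bounded)
    return x

x_final = stochastic_subgradient(5.0)
```

After enough iterations, `x_final` hovers near the minimizer 0, with fluctuations on the order of the final step size; this is the regime in which the averaged iterates of such methods become asymptotically normal.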