CMStatistics 2021
B0289
Title: Stochastic optimization with momentum: Convergence, fluctuations, and traps avoidance
Authors: Anas Barakat - Telecom Paris, IPParis (France) [presenting]
Pascal Bianchi - Telecom Paris, IPParis (France)
Walid Hachem - CNRS - Gustave Eiffel University - Gaspard Monge Computer Science Laboratory (France)
Sholom Schechtman - CNRS - Gustave Eiffel University - Gaspard Monge Computer Science Laboratory (France)
Abstract: A general stochastic optimization procedure is presented that unifies several variants of stochastic gradient descent, including the stochastic heavy-ball method, the stochastic Nesterov accelerated gradient algorithm (S-NAG), and the widely used Adam algorithm. The algorithm is viewed as a noisy Euler discretization of a recently introduced non-autonomous ordinary differential equation, which is analyzed in depth. Assuming that the objective function is non-convex and differentiable, the stability and the almost sure convergence of the iterates to the set of critical points are established. A noteworthy special case is a convergence proof for S-NAG in a non-convex setting. Under some assumptions, the convergence rate is provided in the form of a central limit theorem. Finally, the non-convergence of the algorithm to undesired critical points, such as local maxima or saddle points, is established. The main ingredient here is a new avoidance-of-traps result for non-autonomous settings, which is of independent interest.
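To give intuition for the kind of momentum scheme the abstract refers to, the following is a minimal, purely illustrative sketch of a generic momentum-type stochastic step that specializes to a standard stochastic heavy-ball update or an Adam-like update. The function name, the coefficients (beta1, beta2, eps), and the toy objective are assumptions introduced here for illustration; they are not the exact iteration, step-size schedules, or notation of the paper.

```python
# Illustrative sketch only: a generic momentum-type stochastic optimization
# step; NOT the paper's exact unified algorithm (coefficients and names are
# assumptions).  With rescale=False it behaves like stochastic heavy ball;
# with rescale=True it behaves like an Adam-style rescaled step.
import numpy as np

def generic_momentum_step(x, m, v, grad, lr, beta1=0.9, beta2=0.999,
                          eps=1e-8, rescale=True):
    """One noisy-Euler-style step: update the momentum m, optionally a
    second-moment estimate v, then move the iterate x."""
    m = beta1 * m + (1.0 - beta1) * grad          # momentum (first moment)
    if rescale:
        v = beta2 * v + (1.0 - beta2) * grad**2   # second-moment estimate
        x = x - lr * m / (np.sqrt(v) + eps)       # Adam-like rescaled step
    else:
        x = x - lr * m                            # heavy-ball-like step
    return x, m, v

# Toy usage on f(x) = 0.5 * ||x||^2 with additive gradient noise and a
# decreasing step size, mimicking a noisy Euler discretization of an ODE.
rng = np.random.default_rng(0)
x, m, v = rng.normal(size=5), np.zeros(5), np.zeros(5)
for n in range(1, 2001):
    noisy_grad = x + 0.1 * rng.normal(size=5)     # stochastic gradient oracle
    x, m, v = generic_momentum_step(x, m, v, noisy_grad, lr=0.05 / np.sqrt(n))
print(np.linalg.norm(x))                          # iterate approaches the critical point 0
```

In this toy setting the decreasing step size plays the role of the discretization step of the underlying ordinary differential equation; the abstract's results concern the much more general non-convex, non-autonomous case.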