Variable metric proximal stochastic variance reduced gradient methods for nonconvex nonsmooth optimization
Citation
Source Title
ISSN
Faculty
School
Collection
Abstract
We study the problem of minimizing the sum of two functions. The first function is the average of a large number of nonconvex component functions and the second function is a convex (possibly nonsmooth) function that admits a simple proximal mapping. With a diagonal Barzilai-Borwein stepsize for updating the metric, we propose a variable metric proximal stochastic variance reduced gradient method in the mini-batch setting, named VM-SVRG. It is proved that VM-SVRG converges sublinearly to a stationary point in expectation. We further suggest a variant of VM-SVRG to achieve linear convergence rate in expectation for nonconvex problems satisfying the proximal Polyak-Lojasiewicz inequality. The complexity of VM-SVRG is lower than that of the proximal gradient method and proximal stochastic gradient method, and is the same as the proximal stochastic variance reduced gradient method. Numerical experiments are conducted on standard data sets. Comparisons with other advanced proximal stochastic gradient methods show the efficiency of the proposed method.
Related items
Showing items related by title, author, creator and subject.
-
Yu, T.T.; Liu, X.W.; Dai, Y.H.; Sun, Jie (2022)Many machine learning problems can be formulated as minimizing the sum of a function and a non-smooth regularization term. Proximal stochastic gradient methods are popular for solving such composite optimization problems. ...
-
Yu, T.; Liu, X.W.; Dai, Y.H.; Sun, Jie (2021)We consider the problem of minimizing the sum of an average of a large number of smooth convex component functions and a possibly nonsmooth convex function that admits a simple proximal mapping. This class of problems ...
-
Liu, Chunmin (2008)The optimization problems involving stochastic systems are often encountered in financial systems, networks design and routing, supply-chain management, actuarial science, telecommunications systems, statistical pattern ...