Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 2 of 2
Full-Text Articles in Physical Sciences and Mathematics
Batched Stochastic Gradient Descent With Weighted Sampling, Deanna Needell, Rachel Ward
CMC Faculty Publications and Research
We analyze a batched variant of Stochastic Gradient Descent (SGD) with weighted sampling distribution for smooth and non-smooth objective functions. We show that by distributing the batches computationally, a significant speedup in the convergence rate is provably possible compared to either batched sampling or weighted sampling alone. We propose several computationally efficient schemes to approximate the optimal weights, and compute proposed sampling distributions explicitly for the least squares and hinge loss problems. We show both analytically and experimentally that substantial gains can be obtained …
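As a rough illustration of the idea in the abstract (not the paper's exact scheme), the sketch below runs mini-batch SGD on a least-squares objective, drawing rows with probability proportional to their squared norms — a standard importance-sampling choice for least squares — and reweighting each sampled gradient by 1/p_i so the batch estimate stays unbiased. The step size and batch size here are illustrative assumptions.

```python
import numpy as np

def batched_weighted_sgd(A, b, batch_size=4, n_iters=2000, seed=0):
    """Sketch: mini-batch SGD for 0.5*||Ax - b||^2 with importance sampling.

    Rows are sampled with probability p_i proportional to ||a_i||^2, and each
    sampled gradient is reweighted by 1/p_i to keep the estimate unbiased.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    row_sq = np.sum(A**2, axis=1)
    p = row_sq / row_sq.sum()        # importance-sampling distribution
    lr = 1.0 / row_sq.sum()          # step size matched to the reweighting
    x = np.zeros(d)
    for _ in range(n_iters):
        idx = rng.choice(n, size=batch_size, p=p)
        resid = A[idx] @ x - b[idx]
        # unbiased gradient estimate: each sampled row reweighted by 1/p_i
        grad = (A[idx] * (resid / p[idx])[:, None]).mean(axis=0)
        x -= lr * grad
    return x
```

With this particular step size, each per-row update reduces to a projection onto the hyperplane defined by that row, so on a consistent system the iterates contract toward the least-squares solution.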
Stochastic Gradient Descent, Weighted Sampling, And The Randomized Kaczmarz Algorithm, Deanna Needell, Nathan Srebro, Rachel Ward
CMC Faculty Publications and Research
We obtain an improved finite-sample guarantee on the linear convergence of stochastic gradient descent for smooth and strongly convex objectives, improving from a quadratic dependence on the conditioning (L/µ)² (where L is a bound on the smoothness and µ on the strong convexity) to a linear dependence on L/µ. Furthermore, we show how reweighting the sampling distribution (i.e. importance sampling) is necessary in order to further improve convergence, and obtain a linear dependence on the average smoothness, dominating previous results. We also discuss importance sampling for SGD more broadly and show how it can improve convergence also in other …
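The connection this abstract draws can be made concrete with the randomized Kaczmarz algorithm, which is exactly SGD on the least-squares objective with importance sampling: rows are drawn with probability ||a_i||² / ||A||_F², and each update projects the iterate onto the hyperplane {x : a_i · x = b_i}. A minimal sketch, with illustrative defaults:

```python
import numpy as np

def randomized_kaczmarz(A, b, n_iters=1000, seed=0):
    """Sketch: randomized Kaczmarz viewed as importance-sampled SGD.

    Each step samples row i with probability ||a_i||^2 / ||A||_F^2 and
    projects x onto the solution hyperplane of that row.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    row_sq = np.sum(A**2, axis=1)
    p = row_sq / row_sq.sum()        # row-norm importance sampling
    x = np.zeros(d)
    for _ in range(n_iters):
        i = rng.choice(n, p=p)
        # projection onto {x : a_i . x = b_i}
        x += (b[i] - A[i] @ x) / row_sq[i] * A[i]
    return x
```

For a consistent system the iterates converge linearly in expectation, at a rate governed by the scaled condition number of A — the same conditioning quantity the guarantee above improves from quadratic to linear dependence.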