Open Access. Powered by Scholars. Published by Universities.®

Dynamical Systems Commons

Full-Text Articles in Dynamical Systems

Stationary Probability Distributions Of Stochastic Gradient Descent And The Success And Failure Of The Diffusion Approximation, William Joseph Mccann May 2021

Theses

In this thesis, Stochastic Gradient Descent (SGD), an optimization method originally popular due to its computational efficiency, is analyzed using Markov chain methods. We compute, numerically and in some cases analytically, the stationary probability distributions (invariant measures) of the SGD Markov operator over all step sizes (learning rates). These stationary distributions provide insight into how SGD samples the objective function minimum in its long-time behavior.
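As an illustration of what computing a stationary distribution of the SGD Markov operator can look like, here is a minimal sketch that is not taken from the thesis. It assumes a toy one-dimensional objective f(x) = mean_i (x - a_i)^2 / 2 with two data points a = (-1, 1), a step size restricted to 0 < eta <= 1 so that the interval [-1, 1] stays invariant, and a crude bin-center discretization of the operator followed by power iteration. The function name `sgd_stationary_distribution` and all of its parameters are hypothetical.

```python
def sgd_stationary_distribution(eta, data=(-1.0, 1.0), n_bins=400, n_iter=200):
    """Approximate the stationary distribution of 1D SGD on the toy
    objective f(x) = mean_i (x - a_i)^2 / 2 (an assumption of this sketch)."""
    if not 0.0 < eta <= 1.0:
        raise ValueError("this sketch assumes 0 < eta <= 1")
    lo, hi = min(data), max(data)
    h = (hi - lo) / n_bins
    centers = [lo + (j + 0.5) * h for j in range(n_bins)]

    # One SGD step with sample a_i is the affine map x -> x - eta * (x - a_i).
    # targets[j] holds the bin reached from the center of bin j, one per sample.
    targets = []
    for c in centers:
        row = []
        for a in data:
            y = c - eta * (c - a)
            k = min(n_bins - 1, max(0, int((y - lo) / h)))
            row.append(k)
        targets.append(row)

    # Power iteration: push a point mass through the discretized operator.
    p = [0.0] * n_bins
    p[n_bins // 2] = 1.0
    w = 1.0 / len(data)  # each sample is drawn with equal probability
    for _ in range(n_iter):
        q = [0.0] * n_bins
        for j, row in enumerate(targets):
            if p[j] > 0.0:
                for k in row:
                    q[k] += w * p[j]
        p = q
    return centers, p


centers, p = sgd_stationary_distribution(eta=0.5)
mean = sum(c * w for c, w in zip(centers, p))
var = sum((c - mean) ** 2 * w for c, w in zip(centers, p))
print(mean, var)
```

For this toy model the update is x -> (1 - eta) x + eta a_i, so the exact stationary variance satisfies Var = (1 - eta)^2 Var + eta^2, giving Var = eta / (2 - eta). At eta = 0.5 the stationary distribution is uniform on [-1, 1], with variance 1/3, which gives a simple check on the discretized result. For other step sizes the bin-center discretization is only a rough approximation.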

A key focus of this thesis is a systematic study in one dimension comparing the exact SGD stationary distributions to the Fokker-Planck diffusion approximation equations, which are commonly used in …