no code implementations • 30 May 2024 • Elizabeth Collins-Woodfin, Inbar Seroussi, Begoña García Malaxechebarría, Andrew W. Mackenzie, Elliot Paquette, Courtney Paquette
For noiseless targets, we further demonstrate that the AdaGrad-Norm learning rate converges to a deterministic constant inversely proportional to the average eigenvalue of the data covariance matrix, and identify a phase transition when the eigenvalue density of the covariance matrix follows a power-law distribution.
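As a rough illustration of the quantity discussed in this abstract, the sketch below runs streaming AdaGrad-Norm on a noiseless least-squares problem and records the learning rate at each step. It uses the standard AdaGrad-Norm update (accumulate squared gradient norms, divide a base rate by their square root), which may differ in normalization from the paper's setup. The function name, the linearly spaced spectrum, and all default parameters are assumptions made for the example and are not taken from the paper.

```python
import numpy as np

def adagrad_norm_least_squares(d=50, steps=2000, eta=1.0, b0=1.0, seed=0):
    """Streaming AdaGrad-Norm on noiseless least squares (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Hypothetical covariance spectrum: diagonal covariance with these eigenvalues.
    eigs = np.linspace(0.5, 1.5, d)
    x_star = rng.standard_normal(d) / np.sqrt(d)  # ground-truth signal
    x = np.zeros(d)
    b_sq = b0 ** 2          # running sum of squared gradient norms
    rates = []
    for _ in range(steps):
        # Fresh sample with covariance diag(eigs); noiseless target y = a @ x_star.
        a = rng.standard_normal(d) * np.sqrt(eigs)
        residual = a @ (x - x_star)
        grad = residual * a                 # gradient of 0.5 * residual**2
        b_sq += grad @ grad                 # AdaGrad-Norm accumulator
        rate = eta / np.sqrt(b_sq)          # scalar learning rate for this step
        x = x - rate * grad
        rates.append(rate)
    return np.array(rates), x, x_star

if __name__ == "__main__":
    rates, x, x_star = adagrad_norm_least_squares()
    print("final learning rate:", rates[-1])
    print("mean eigenvalue of covariance:", np.linspace(0.5, 1.5, 50).mean())
```

Because the accumulator only grows, the recorded learning rate is non-increasing by construction. Whether it levels off at a positive value in a given run, and how that value relates to the average eigenvalue, is the kind of behavior the abstract describes and can be explored by varying the assumed spectrum.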
no code implementations • 17 Aug 2023 • Elizabeth Collins-Woodfin, Courtney Paquette, Elliot Paquette, Inbar Seroussi
In addition to the deterministic equivalent, we introduce an SDE with a simplified diffusion coefficient (homogenized SGD) which allows us to analyze the dynamics of general statistics of SGD iterates.
no code implementations • 19 Jan 2021 • Elizabeth Collins-Woodfin
Previous research has noted that, at low temperature, this overlap exhibits dramatically different behavior in the presence of an external field as compared to the model with no external field.
Probability Mathematical Physics