1 code implementation • 5 Feb 2024 • Xingyu Qu, Samuel Horvath
Recent studies suggest that with sufficiently wide models, most SGD solutions can, up to permutation, converge into the same basin.
no code implementations • 15 Sep 2022 • Xingyu Qu, Diyang Li, Xiaohan Zhao, Bin Gu
The SPL regime involves a self-paced regularizer and a gradually increasing age parameter. The age parameter plays a key role in SPL, but determining where to optimally terminate this process remains non-trivial.