Search Results for author: Francesca Mignacco

Found 9 papers, 3 papers with code

Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers

1 code implementation • 24 May 2024 • Lorenzo Tiberi, Francesca Mignacco, Kazuki Irie, Haim Sompolinsky

Our theory shows that the predictor statistics are expressed as a sum of independent kernels, each one pairing different 'attention paths', defined as information pathways through different attention heads across layers.
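To make the "sum over attention paths" picture concrete, here is a toy sketch: each path picks one head per layer, inputs are propagated along that path to give a per-path feature map, and the predictor kernel is the (uniformly weighted) sum of the per-path Gram matrices. The fixed random attention matrices and the uniform weighting are illustrative assumptions only, not the paper's construction.

```python
# Toy illustration (NOT the paper's derivation): build a kernel as a
# sum of independent per-path kernels, one per choice of head per layer.
import itertools
import numpy as np

rng = np.random.default_rng(0)
N, D, L, H = 8, 16, 2, 3           # samples, dim, layers, heads per layer

X = rng.standard_normal((N, D))
# One fixed random "attention" matrix per (layer, head) -- an assumption
# made purely for illustration.
A = rng.standard_normal((L, H, D, D)) / np.sqrt(D)

n_paths = H ** L
K = np.zeros((N, N))
for path in itertools.product(range(H), repeat=L):   # H**L attention paths
    phi = X
    for layer, head in enumerate(path):              # propagate along the path
        phi = phi @ A[layer, head]
    K += phi @ phi.T / n_paths                       # per-path kernel, summed
```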

Nonlinear classification of neural manifolds with contextual information

no code implementations • 10 May 2024 • Francesca Mignacco, Chi-Ning Chou, SueYeon Chung

Understanding how neural systems efficiently process information through distributed representations is a fundamental challenge at the interface of neuroscience and machine learning.

Classification

Forward Learning with Top-Down Feedback: Empirical and Analytical Characterization

no code implementations • 10 Feb 2023 • Ravi Srinivasan, Francesca Mignacco, Martino Sorbaro, Maria Refinetti, Avi Cooper, Gabriel Kreiman, Giorgia Dellaferrera

"Forward-only" algorithms, which train neural networks while avoiding a backward pass, have recently gained attention as a way of solving the biologically unrealistic aspects of backpropagation.

Rigorous dynamical mean field theory for stochastic gradient descent methods

1 code implementation • 12 Oct 2022 • Cedric Gerbelot, Emanuele Troiani, Francesca Mignacco, Florent Krzakala, Lenka Zdeborova

We prove closed-form equations for the exact high-dimensional asymptotics of a family of first-order gradient-based methods, learning an estimator (e.g., an M-estimator, a shallow neural network, etc.) from observations on Gaussian data with empirical risk minimization.
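One common way to write the family of first-order methods in question (notation assumed here for concreteness, up to normalization conventions; see the paper for the precise setting) is the regularized ERM update

$$ w^{t+1} = w^{t} - \eta \left[ \sum_{\mu=1}^{n} s^{t}_{\mu}\, \nabla_{w}\, \ell\!\left(y_{\mu}, \tfrac{x_{\mu}^{\top} w^{t}}{\sqrt{d}}\right) + \lambda\, \nabla_{w} r(w^{t}) \right], $$

where the selection variables $s^{t}_{\mu} \in \{0,1\}$ pick out the mini-batch at step $t$ (so $s^{t}_{\mu} \equiv 1$ recovers full-batch gradient descent), $\ell$ is the loss, and $r$ a regularizer.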

Learning curves for the multi-class teacher-student perceptron

1 code implementation • 22 Mar 2022 • Elisabetta Cornacchia, Francesca Mignacco, Rodrigo Veiga, Cédric Gerbelot, Bruno Loureiro, Lenka Zdeborová

For Gaussian teacher weights, we investigate the performance of ERM with both cross-entropy and square losses, and explore the role of ridge regularisation in approaching Bayes-optimality.

Binary Classification • Learning Theory +1
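A hedged sketch of the multi-class teacher-student setup described above: Gaussian inputs, Gaussian teacher weights, labels given by the argmax of the teacher's outputs, then ERM with a ridge-regularized square loss in closed form. Dimensions, sample ratio, and the ridge strength are illustrative choices, not the paper's settings.

```python
# Teacher-student toy experiment: square-loss ERM with ridge penalty,
# evaluated against fresh samples from the same Gaussian teacher.
import numpy as np

rng = np.random.default_rng(0)
d, K, alpha, lam = 200, 3, 2.0, 0.1        # dim, classes, n/d ratio, ridge
n = int(alpha * d)

W_teacher = rng.standard_normal((K, d))
X = rng.standard_normal((n, d)) / np.sqrt(d)
y = np.argmax(X @ W_teacher.T, axis=1)      # teacher labels (argmax rule)
Y = np.eye(K)[y]                            # one-hot targets

# Ridge ERM in closed form: W = (X^T X + lam I)^{-1} X^T Y.
W_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y).T

# Generalization error on fresh data from the same teacher.
X_test = rng.standard_normal((5 * n, d)) / np.sqrt(d)
y_test = np.argmax(X_test @ W_teacher.T, axis=1)
err = np.mean(np.argmax(X_test @ W_hat.T, axis=1) != y_test)
print(f"test error: {err:.3f}")
```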

The effective noise of Stochastic Gradient Descent

no code implementations • 20 Dec 2021 • Francesca Mignacco, Pierfrancesco Urbani

In the under-parametrized regime, where the final training error is positive, the SGD dynamics reaches a stationary state, and we define an effective temperature from the fluctuation-dissipation theorem, computed within dynamical mean-field theory.
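The effective temperature alluded to here is the standard dynamical one: writing $C(t,t')$ for a two-time correlation function and $R(t,t')$ for the associated linear response (notation assumed for concreteness), a generalized fluctuation-dissipation relation defines $T_{\mathrm{eff}}$ via

$$ R(t,t') = \frac{1}{T_{\mathrm{eff}}}\, \frac{\partial C(t,t')}{\partial t'}, \qquad t > t', $$

so that $T_{\mathrm{eff}}$ can be read off from the slope of the integrated response plotted against the correlation.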

Stochasticity helps to navigate rough landscapes: comparing gradient-descent-based algorithms in the phase retrieval problem

no code implementations • 8 Mar 2021 • Francesca Mignacco, Pierfrancesco Urbani, Lenka Zdeborová

In this paper we investigate how gradient-based algorithms such as gradient descent, (multi-pass) stochastic gradient descent, its persistent variant, and the Langevin algorithm navigate non-convex loss landscapes, and which of them reaches the best generalization error at limited sample complexity.

Navigate • Retrieval
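As a toy version of the comparison above, the sketch below sets up the standard phase-retrieval loss on synthetic Gaussian data and runs two of the algorithms compared (full-batch gradient descent vs. mini-batch SGD). The model size, step size, and batch size are illustrative assumptions, not the paper's settings.

```python
# Phase retrieval toy: observe y = (x . w*)^2, minimize the squared
# residual with full-batch GD and with mini-batch SGD, then measure
# the overlap of each estimate with the hidden signal.
import numpy as np

rng = np.random.default_rng(0)
d, n, lr, batch, steps = 100, 300, 0.05, 30, 2000

w_star = rng.standard_normal(d)
X = rng.standard_normal((n, d)) / np.sqrt(d)
y = (X @ w_star) ** 2                      # sign-less (phaseless) observations

def grad(w, idx):
    """Gradient of (1/4) * mean((z^2 - y)^2) with z = X w, over idx."""
    z = X[idx] @ w
    return X[idx].T @ ((z ** 2 - y[idx]) * z) / len(idx)

w_gd = rng.standard_normal(d)
w_sgd = w_gd.copy()
for t in range(steps):
    w_gd -= lr * grad(w_gd, np.arange(n))             # full-batch GD
    w_sgd -= lr * grad(w_sgd, rng.choice(n, batch))   # mini-batch SGD

for name, w in [("GD", w_gd), ("SGD", w_sgd)]:
    overlap = abs(w @ w_star) / (np.linalg.norm(w) * np.linalg.norm(w_star))
    print(f"{name}: overlap with signal = {overlap:.3f}")
```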
