1 code implementation • 5 Feb 2024 • Simone Bombari, Marco Mondelli
Understanding the reasons behind the exceptional success of transformers requires a better analysis of why attention layers are suitable for NLP tasks.
1 code implementation • 20 May 2023 • Simone Bombari, Marco Mondelli
In this paper, we consider spurious features that are uncorrelated with the learning task, and we provide a precise characterization of how they are memorized via two separate terms: (i) the stability of the model with respect to individual training samples, and (ii) the feature alignment between the spurious feature and the full sample.
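Read as a recipe, the two terms suggest a simple diagnostic. The sketch below is a loose illustration under strong assumptions (ridge regression as the learner, a leave-one-out prediction difference as the stability proxy, a normalized inner product as the alignment); it mirrors the shape of the decomposition, not the paper's precise definitions.

```python
import numpy as np

def feature_alignment(spurious, sample):
    # Normalized inner product between the spurious feature and the full sample.
    return np.dot(spurious, sample) / (np.linalg.norm(spurious) * np.linalg.norm(sample))

def loo_stability(train_fn, X, y, i, x_probe):
    # Leave-one-out proxy for stability: how much the prediction at x_probe
    # changes when training sample i is removed from the training set.
    f_full = train_fn(X, y)
    keep = np.arange(len(y)) != i
    f_loo = train_fn(X[keep], y[keep])
    return abs(f_full(x_probe) - f_loo(x_probe))

def ridge(X, y, lam=1e-3):
    # A stand-in learner for illustration (assumed, not from the paper).
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    return lambda x: x @ w

rng = np.random.default_rng(0)
X, y = rng.standard_normal((50, 10)), rng.standard_normal(50)
spurious = rng.standard_normal(10)
print(feature_alignment(spurious, X[0]))
print(loo_stability(ridge, X, y, 0, X[0]))
```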
1 code implementation • 3 Feb 2023 • Simone Bombari, Shayan Kiyani, Marco Mondelli
However, this "universal" law provides only a necessary condition for robustness, and it is unable to discriminate between models.
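For context, the law in question is Bubeck and Sellke's universal law of robustness, recalled here informally (constants and noise level omitted): any p-parameter model f that interpolates n noisy samples in dimension d must have Lipschitz constant at least of order

```latex
\mathrm{Lip}(f) \;\gtrsim\; \sqrt{\frac{nd}{p}}
```

Two interpolating models with the same parameter count receive the same lower bound, which is why the law alone cannot discriminate between them.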
no code implementations • 20 May 2022 • Simone Bombari, Mohammad Hossein Amani, Marco Mondelli
The Neural Tangent Kernel (NTK) has emerged as a powerful tool to provide memorization, optimization and generalization guarantees in deep neural networks.
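Concretely, the empirical NTK of a network f_θ is the Gram matrix of parameter gradients, K(x, x') = ⟨∇_θ f_θ(x), ∇_θ f_θ(x')⟩. A minimal sketch of that computation on a small two-layer model (architecture and sizes assumed for illustration; this is the finite-width empirical kernel, not the infinite-width limit):

```python
import torch

def empirical_ntk(model, x1, x2):
    # One NTK entry: inner product of parameter gradients at x1 and x2.
    def grad_vec(x):
        model.zero_grad()
        model(x).squeeze().backward()
        return torch.cat([p.grad.flatten() for p in model.parameters()])
    g1, g2 = grad_vec(x1), grad_vec(x2)
    return torch.dot(g1, g2).item()

# A hypothetical two-layer network with scalar output.
model = torch.nn.Sequential(torch.nn.Linear(10, 50), torch.nn.ReLU(), torch.nn.Linear(50, 1))
x1, x2 = torch.randn(1, 10), torch.randn(1, 10)
print(empirical_ntk(model, x1, x2))
```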
no code implementations • 17 May 2022 • Mohammad Hossein Amani, Simone Bombari, Marco Mondelli, Rattana Pukdee, Stefano Rini
In this paper, we study the compression of a target two-layer neural network with N nodes into a compressed network with M < N nodes.
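One generic way to instantiate this setup in code (an output-matching baseline under an assumed architecture and input distribution, not the sharp asymptotic analysis the paper carries out) is to train the M-node network to imitate the N-node target:

```python
import torch

d, N, M = 10, 100, 20   # input dimension, target width, compressed width (all assumed)
target = torch.nn.Sequential(torch.nn.Linear(d, N), torch.nn.ReLU(), torch.nn.Linear(N, 1))
student = torch.nn.Sequential(torch.nn.Linear(d, M), torch.nn.ReLU(), torch.nn.Linear(M, 1))

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
for step in range(2000):
    x = torch.randn(256, d)  # Gaussian inputs, an assumption for illustration
    # Squared discrepancy between compressed and target outputs; the target is
    # detached so only the student's parameters are updated.
    loss = ((student(x) - target(x).detach()) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```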
no code implementations • 30 Mar 2022 • Simone Bombari, Alessandro Achille, Zijian Wang, Yu-Xiang Wang, Yusheng Xie, Kunwar Yashraj Singh, Srikar Appalaraju, Vijay Mahadevan, Stefano Soatto
While bounding general memorization can have detrimental effects on the performance of a trained model, bounding relational memorization (RM) does not prevent effective learning.