no code implementations • 7 Feb 2024 • Daniel Beaglehole, Ioannis Mitliagkas, Atish Agarwala
Prior works have identified that the Gram matrices of the weights in trained neural networks of general architectures are proportional to the average gradient outer product of the model, a statement known as the Neural Feature Ansatz (NFA).
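As an illustrative sketch (not the paper's construction), the NFA relation holds exactly in the trivial case of a one-layer linear model f(x) = w·x, where the input gradient equals w for every sample, so the average gradient outer product coincides with WᵀW; sizes below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 100
w = rng.standard_normal(d)        # weights of a linear model f(x) = w @ x
grads = np.tile(w, (n, 1))        # input gradient of f is w for every sample
agop = grads.T @ grads / n        # average gradient outer product over n samples
gram = np.outer(w, w)             # W^T W for the 1 x d weight matrix
assert np.allclose(agop, gram)    # exact proportionality (constant 1) in this case
```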
1 code implementation • 22 Aug 2023 • Charles Guille-Escuret, Pierre-André Noël, Ioannis Mitliagkas, David Vazquez, Joao Monteiro
Our findings reveal that while these methods excel in detecting unknown classes, their performance is inconsistent when encountering other types of distribution shifts.
1 code implementation • 17 Jul 2023 • Hiroki Naganuma, Ryuichiro Hataya, Ioannis Mitliagkas
In the realm of out-of-distribution (OOD) generalization tasks, fine-tuning pre-trained models has become a prevalent strategy.
1 code implementation • 20 Jun 2023 • Charles Guille-Escuret, Hiroki Naganuma, Kilian Fatras, Ioannis Mitliagkas
Understanding the optimization dynamics of neural networks is necessary for closing the gap between theory and practice.
1 code implementation • 14 Apr 2023 • Mehrnaz Mofakhami, Ioannis Mitliagkas, Gauthier Gidel
In this work, we instead assume that the data distribution is Lipschitz continuous with respect to the model's predictions, a more natural assumption for performative systems.
1 code implementation • 26 Nov 2022 • Sébastien Lachapelle, Tristan Deleu, Divyat Mahajan, Ioannis Mitliagkas, Yoshua Bengio, Simon Lacoste-Julien, Quentin Bertrand
Although disentangled representations are often said to be beneficial for downstream tasks, current empirical and theoretical understanding is limited.
1 code implementation • 15 Nov 2022 • Hiroki Naganuma, Kartik Ahuja, Shiro Takagi, Tetsuya Motokawa, Rio Yokota, Kohta Ishikawa, Ikuro Sato, Ioannis Mitliagkas
Modern deep learning systems do not generalize well when the test data distribution is slightly different from the training data distribution.
1 code implementation • 3 Nov 2022 • Divyat Mahajan, Ioannis Mitliagkas, Brady Neal, Vasilis Syrgkanis
We study the problem of model selection in causal inference, specifically for conditional average treatment effect (CATE) estimation.
1 code implementation • 6 Oct 2022 • Adam Ibrahim, Charles Guille-Escuret, Ioannis Mitliagkas, Irina Rish, David Krueger, Pouya Bashivan
Compared to existing methods, we obtain similar or superior worst-case adversarial robustness on attacks seen during training.
no code implementations • 3 Oct 2022 • Tiago Salvador, Kilian Fatras, Ioannis Mitliagkas, Adam Oberman
In this work, we consider the Partial Domain Adaptation (PDA) variant, where we have extra source classes not present in the target domain.
no code implementations • 29 Sep 2022 • Alireza Mousavi-Hosseini, Sejun Park, Manuela Girotti, Ioannis Mitliagkas, Murat A. Erdogdu
We further demonstrate that SGD-trained ReLU NNs can learn a single-index target of the form $y=f(\langle\boldsymbol{u},\boldsymbol{x}\rangle) + \epsilon$ by recovering the principal direction, with a sample complexity linear in $d$ (up to log factors), where $f$ is a monotonic function with at most polynomial growth, and $\epsilon$ is the noise.
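A minimal sketch of this setting (sizes, the tanh link, and all hyperparameters are illustrative choices, not the paper's): data are generated from the single-index model with a monotone link, and a one-hidden-layer ReLU network is trained by plain SGD on the squared loss:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, m, lr = 10, 2000, 32, 0.02
u = rng.standard_normal(d)
u /= np.linalg.norm(u)                                # hidden principal direction
X = rng.standard_normal((n, d))
y = np.tanh(X @ u) + 0.05 * rng.standard_normal(n)    # monotone link + noise

W = rng.standard_normal((m, d)) / np.sqrt(d)          # first-layer weights
a = rng.standard_normal(m) / np.sqrt(m)               # output weights

def mse():
    return np.mean((np.maximum(X @ W.T, 0.0) @ a - y) ** 2)

mse_init = mse()
for _ in range(20000):                                # plain single-sample SGD
    i = rng.integers(n)
    h = np.maximum(W @ X[i], 0.0)                     # hidden ReLU activations
    err = a @ h - y[i]
    a -= lr * err * h                                 # output-layer gradient step
    W -= lr * err * np.outer(a * (h > 0), X[i])       # first-layer gradient step
assert mse() < mse_init                               # training reduces the loss
```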
no code implementations • 22 Jun 2022 • Kilian Fatras, Hiroki Naganuma, Ioannis Mitliagkas
It is common in computer vision to be confronted with domain shift: images which have the same class but different acquisition conditions.
3 code implementations • 12 Jun 2022 • Samuel Sokota, Ryan D'Orazio, J. Zico Kolter, Nicolas Loizou, Marc Lanctot, Ioannis Mitliagkas, Noam Brown, Christian Kroer
This work studies an algorithm, which we call magnetic mirror descent, that is inspired by mirror descent and the non-Euclidean proximal gradient algorithm.
1 code implementation • 10 Apr 2022 • Kartik Ahuja, Divyat Mahajan, Vasilis Syrgkanis, Ioannis Mitliagkas
In this work, we depart from these assumptions and ask: a) How can we get disentanglement when the auxiliary information does not provide conditional independence over the factors of variation?
no code implementations • 28 Oct 2021 • Ryan D'Orazio, Nicolas Loizou, Issam Laradji, Ioannis Mitliagkas
We investigate the convergence of stochastic mirror descent (SMD) under interpolation in relatively smooth and smooth convex optimization.
no code implementations • 20 Oct 2021 • Manuela Girotti, Ioannis Mitliagkas, Gauthier Gidel
We theoretically analyze the Feedback Alignment (FA) algorithm, an efficient alternative to backpropagation for training neural networks.
no code implementations • NeurIPS Workshop DLDE 2021 • Alexia Jolicoeur-Martineau, Ke Li, Rémi Piché-Taillefer, Tal Kachman, Ioannis Mitliagkas
Score-based (denoising diffusion) generative models have recently gained a lot of success in generating realistic and diverse data.
1 code implementation • NeurIPS 2021 • Nicolas Loizou, Hugo Berard, Gauthier Gidel, Ioannis Mitliagkas, Simon Lacoste-Julien
Two of the most prominent algorithms for solving unconstrained smooth games are the classical stochastic gradient descent-ascent (SGDA) and the recently introduced stochastic consensus optimization (SCO) [Mescheder et al., 2017].
2 code implementations • NeurIPS 2021 • Kartik Ahuja, Ethan Caballero, Dinghuai Zhang, Jean-Christophe Gagnon-Audet, Yoshua Bengio, Ioannis Mitliagkas, Irina Rish
To answer these questions, we revisit the fundamental assumptions in linear regression tasks, where invariance-based approaches were shown to provably generalize OOD.
1 code implementation • 28 May 2021 • Alexia Jolicoeur-Martineau, Ke Li, Rémi Piché-Taillefer, Tal Kachman, Ioannis Mitliagkas
For high-resolution images, our method leads to significantly higher quality samples than all other methods tested.
Ranked #8 on Image Generation on CIFAR-10 (Inception score metric)
no code implementations • 10 Dec 2020 • Charles Guille-Escuret, Baptiste Goujaud, Manuela Girotti, Ioannis Mitliagkas
Since smoothness and strong convexity are not continuous, we propose a comprehensive study of existing alternative metrics which we prove to be continuous.
1 code implementation • 26 Oct 2020 • Reyhane Askari Hemmat, Amartya Mitra, Guillaume Lajoie, Ioannis Mitliagkas
A central obstacle in the optimization of such games is the rotational dynamics that hinder their convergence.
1 code implementation • NeurIPS 2020 • Gintare Karolina Dziugaite, Alexandre Drouin, Brady Neal, Nitarshan Rajkumar, Ethan Caballero, Linbo Wang, Ioannis Mitliagkas, Daniel M. Roy
A large volume of work aims to close this gap, primarily by developing bounds on generalization error, optimization error, and excess risk.
1 code implementation • ICLR 2021 • Alexia Jolicoeur-Martineau, Rémi Piché-Taillefer, Rémi Tachet des Combes, Ioannis Mitliagkas
Denoising Score Matching with Annealed Langevin Sampling (DSM-ALS) has recently found success in generative modeling.
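As a hedged illustration of the Langevin sampling component only (using the analytic score of a standard Gaussian, score(x) = −x, in place of a learned score network; step size and iteration count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
step = 0.1
x = rng.standard_normal(10000) * 5.0   # initialize far from the target distribution
for _ in range(500):
    # Langevin update: drift along the score plus Gaussian noise.
    x = x + step * (-x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)

# Samples should approximate N(0, 1) (up to discretization bias).
assert abs(x.mean()) < 0.1 and abs(x.std() - 1.0) < 0.1
```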
Ranked #54 on Image Generation on CIFAR-10
no code implementations • ICML 2020 • Nicolas Loizou, Hugo Berard, Alexia Jolicoeur-Martineau, Pascal Vincent, Simon Lacoste-Julien, Ioannis Mitliagkas
The success of adversarial formulations in machine learning has brought renewed motivation for smooth games.
no code implementations • 2 Jan 2020 • Waïss Azizian, Damien Scieur, Ioannis Mitliagkas, Simon Lacoste-Julien, Gauthier Gidel
Using this perspective, we propose an optimal algorithm for bilinear games.
2 code implementations • 3 Nov 2019 • Isabela Albuquerque, João Monteiro, Mohammad Darvishi, Tiago H. Falk, Ioannis Mitliagkas
In this work, we tackle this problem by focusing on domain generalization: a formalization in which the data-generating process at test time may yield samples from never-before-seen domains (distributions).
Ranked #61 on Domain Generalization on PACS
2 code implementations • 15 Oct 2019 • Alexia Jolicoeur-Martineau, Ioannis Mitliagkas
We present a unifying framework of expected margin maximization and show that a wide range of gradient-penalized GANs (e.g., Wasserstein, Standard, Least-Squares, and Hinge GANs) can be derived from this framework.
Ranked #131 on Image Generation on CIFAR-10
no code implementations • ICML 2020 • Adam Ibrahim, Waïss Azizian, Gauthier Gidel, Ioannis Mitliagkas
In this work, we approach the question of fundamental iteration complexity by providing lower bounds to complement the linear (i.e., geometric) upper bounds observed in the literature on a wide class of problems.
no code implementations • 13 Jun 2019 • Waïss Azizian, Ioannis Mitliagkas, Simon Lacoste-Julien, Gauthier Gidel
We provide new analyses of EG's local and global convergence properties and use them to obtain a tighter global convergence rate for OG and CO. Our analysis covers the whole range of settings between bilinear and strongly monotone games.
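For intuition, a toy sketch (not from the paper) of the extragradient (EG) update on the bilinear game min_x max_y xy, where simultaneous gradient descent-ascent spirals outward but EG converges to the equilibrium at the origin:

```python
# Bilinear game: L(x, y) = x * y, so dL/dx = y and dL/dy = x.
x, y, lr = 1.0, 1.0, 0.1
for _ in range(2000):
    # Extrapolation (half) step: one gradient descent-ascent step.
    x_h, y_h = x - lr * y, y + lr * x
    # Update step: apply the gradients evaluated at the extrapolated point.
    x, y = x - lr * y_h, y + lr * x_h

assert abs(x) < 1e-2 and abs(y) < 1e-2   # converged near the saddle (0, 0)
```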
1 code implementation • NeurIPS 2019 • Sébastien M. R. Arnold, Pierre-Antoine Manzagol, Reza Babanezhad, Ioannis Mitliagkas, Nicolas Le Roux
While variance reduction methods have shown that reusing past gradients can be beneficial when there is a finite number of datapoints, they do not easily extend to the online setting.
no code implementations • 26 May 2019 • Alex Lamb, Jonathan Binas, Anirudh Goyal, Sandeep Subramanian, Ioannis Mitliagkas, Denis Kazakov, Yoshua Bengio, Michael C. Mozer
Machine learning promises methods that generalize well from finite labeled data.
no code implementations • ICML Workshop Deep_Phenomen 2019 • Brady Neal, Ioannis Mitliagkas
There is significant recent evidence in supervised learning that, in the over-parametrized setting, wider networks achieve better test error.
1 code implementation • ICLR 2019 • Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Aaron Courville, Ioannis Mitliagkas, Yoshua Bengio
Because the hidden states are learned, this has the important effect of encouraging the hidden states for a class to be concentrated in such a way that interpolations within the same class or between two different classes do not intersect the real data points from other classes.
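The interpolation itself can be sketched as follows (dimensions and the Beta(2, 2) mixing distribution are illustrative choices, not the paper's exact configuration): a hidden state and a soft label are mixed with the same coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
h1, h2 = rng.standard_normal((2, 8))   # hidden states of two training examples
y1, y2 = np.eye(3)[0], np.eye(3)[1]    # one-hot labels for 3 classes
lam = rng.beta(2.0, 2.0)               # mixing coefficient drawn from Beta(2, 2)

h_mix = lam * h1 + (1 - lam) * h2      # interpolated hidden state
y_mix = lam * y1 + (1 - lam) * y2      # matching interpolated (soft) label

assert h_mix.shape == (8,) and abs(y_mix.sum() - 1.0) < 1e-9
```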
no code implementations • 29 Mar 2019 • Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Jennifer Chayes, Eric Chung, Bill Dally, Jeff Dean, Inderjit S. Dhillon, Alexandros Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R. Ganger, Lise Getoor, Phillip B. Gibbons, Garth A. Gibson, Joseph E. Gonzalez, Justin Gottschlich, Song Han, Kim Hazelwood, Furong Huang, Martin Jaggi, Kevin Jamieson, Michael I. Jordan, Gauri Joshi, Rania Khalaf, Jason Knight, Jakub Konečný, Tim Kraska, Arun Kumar, Anastasios Kyrillidis, Aparna Lakshmiratan, Jing Li, Samuel Madden, H. Brendan McMahan, Erik Meijer, Ioannis Mitliagkas, Rajat Monga, Derek Murray, Kunle Olukotun, Dimitris Papailiopoulos, Gennady Pekhimenko, Theodoros Rekatsinas, Afshin Rostamizadeh, Christopher Ré, Christopher De Sa, Hanie Sedghi, Siddhartha Sen, Virginia Smith, Alex Smola, Dawn Song, Evan Sparks, Ion Stoica, Vivienne Sze, Madeleine Udell, Joaquin Vanschoren, Shivaram Venkataraman, Rashmi Vinayak, Markus Weimer, Andrew Gordon Wilson, Eric Xing, Matei Zaharia, Ce Zhang, Ameet Talwalkar
Machine learning (ML) techniques are enjoying rapidly increasing adoption.
1 code implementation • ICLR 2019 • Isabela Albuquerque, João Monteiro, Thang Doan, Breandan Considine, Tiago Falk, Ioannis Mitliagkas
Recent literature has demonstrated promising results for training Generative Adversarial Networks by employing a set of discriminators, in contrast to the traditional game involving one generator against a single adversary.
no code implementations • 19 Oct 2018 • Brady Neal, Sarthak Mittal, Aristide Baratin, Vinayak Tantia, Matthew Scicluna, Simon Lacoste-Julien, Ioannis Mitliagkas
The bias-variance tradeoff tells us that as model complexity increases, bias falls and variance increases, leading to a U-shaped test error curve.
1 code implementation • ICLR 2019 • Devansh Arpit, Bhargav Kanuparthi, Giancarlo Kerg, Nan Rosemary Ke, Ioannis Mitliagkas, Yoshua Bengio
This problem becomes more evident in tasks where the information needed to correctly solve them exists over long time scales, because the EVGP prevents important gradient components from being back-propagated adequately over a large number of steps.
1 code implementation • 12 Jul 2018 • Gauthier Gidel, Reyhane Askari Hemmat, Mohammad Pezeshki, Remi Lepriol, Gabriel Huang, Simon Lacoste-Julien, Ioannis Mitliagkas
Games generalize the single-objective optimization paradigm by introducing different objective functions for different players.
12 code implementations • ICLR 2019 • Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Ioannis Mitliagkas, Aaron Courville, David Lopez-Paz, Yoshua Bengio
Deep neural networks excel at learning the training data, but often provide incorrect and confident predictions when evaluated on slightly different test examples.
Ranked #18 on Image Classification on OmniBenchmark
1 code implementation • ICLR 2019 • Alex Lamb, Jonathan Binas, Anirudh Goyal, Dmitriy Serdyuk, Sandeep Subramanian, Ioannis Mitliagkas, Yoshua Bengio
Deep networks have achieved impressive results across a variety of important tasks.
no code implementations • ICLR 2018 • Brady Neal, Alex Lamb, Sherjil Ozair, Devon Hjelm, Aaron Courville, Yoshua Bengio, Ioannis Mitliagkas
One of the most successful techniques in generative models has been decomposing a complicated generation task into a series of simpler generation tasks.
no code implementations • 17 Aug 2017 • Thorsten Kurth, Jian Zhang, Nadathur Satish, Ioannis Mitliagkas, Evan Racah, Mostofa Ali Patwary, Tareq Malas, Narayanan Sundaram, Wahid Bhimji, Mikhail Smorkalov, Jack Deslippe, Mikhail Shiryaev, Srinivas Sridharan, Prabhat, Pradeep Dubey
This paper presents the first 15-PetaFLOP deep learning system for solving scientific pattern classification problems on contemporary HPC architectures.
no code implementations • ICML 2017 • Ioannis Mitliagkas, Lester Mackey
The pairwise influence matrix of Dobrushin has long been used as an analytical tool to bound the rate of convergence of Gibbs sampling.
2 code implementations • 10 Jul 2017 • Christopher De Sa, Bryan He, Ioannis Mitliagkas, Christopher Ré, Peng Xu
We propose a simple variant of the power iteration with an added momentum term that achieves both the optimal sample and iteration complexity.
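A hedged sketch of the deterministic variant on a synthetic symmetric matrix: the recursion adds a heavy-ball-style term −βx₍ₜ₋₁₎ to the usual power step, with β set near λ₂²/4 (matrix size, spectrum, and iteration count are arbitrary choices here):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal basis
eigvals = np.linspace(1.0, 3.0, n)                 # known spectrum with a gap
A = Q @ np.diag(eigvals) @ Q.T                     # symmetric test matrix

def power_iteration_momentum(A, beta, iters=400):
    x_prev = np.zeros(A.shape[0])
    x = rng.standard_normal(A.shape[0])
    for _ in range(iters):
        x_new = A @ x - beta * x_prev              # power step with momentum
        scale = np.linalg.norm(x_new)
        x, x_prev = x_new / scale, x / scale       # rescale both iterates jointly
    return x

beta = (eigvals[-2] ** 2) / 4.0                    # momentum near lambda_2^2 / 4
v = power_iteration_momentum(A, beta)
top = Q[:, -1]                                     # eigenvector of the top eigenvalue
assert abs(v @ top) > 0.98                         # recovered the top direction
```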
3 code implementations • ICML 2018 • Panos Achlioptas, Olga Diamanti, Ioannis Mitliagkas, Leonidas Guibas
Three-dimensional geometric data offer an excellent domain for studying representation learning and generative modeling.
2 code implementations • ICLR 2018 • Jian Zhang, Ioannis Mitliagkas
We revisit the momentum SGD algorithm and show that hand-tuning a single learning rate and momentum makes it competitive with Adam.
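The update being tuned is classic heavy-ball momentum SGD; a minimal sketch on a toy least-squares problem, with illustrative (learning rate, momentum) values rather than tuned ones:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 10))
w_star = np.ones(10)
b = A @ w_star                          # consistent system: zero loss at w_star
w = np.zeros(10)
v = np.zeros(10)
lr, mu = 0.005, 0.9                     # single learning rate and momentum

for _ in range(5000):
    i = rng.integers(500)
    grad = (A[i] @ w - b[i]) * A[i]     # stochastic gradient of 0.5*(A[i]@w - b[i])^2
    v = mu * v - lr * grad              # heavy-ball velocity update
    w = w + v

assert np.linalg.norm(w - w_star) < 0.1   # converged near the solution
```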
no code implementations • 23 Jun 2016 • Jian Zhang, Christopher De Sa, Ioannis Mitliagkas, Christopher Ré
Consider a number of workers running SGD independently on the same pool of data and averaging the models every once in a while -- a common but not well understood practice.
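A toy single-process sketch of that practice (problem, worker count, and averaging schedule are all hypothetical): several "workers" run SGD independently from a shared model and their parameters are averaged at the end of each round.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))
w_star = np.ones(5)
b = A @ w_star                          # shared least-squares problem

def sgd(w, steps, lr=0.01):
    for _ in range(steps):
        i = rng.integers(len(b))
        w = w - lr * (A[i] @ w - b[i]) * A[i]
    return w

workers = [rng.standard_normal(5) for _ in range(4)]
for _ in range(20):                     # 20 rounds of local SGD + averaging
    workers = [sgd(w, 50) for w in workers]
    avg = sum(workers) / len(workers)   # average the models "every once in a while"
    workers = [avg.copy() for _ in workers]

assert np.linalg.norm(avg - w_star) < 0.1
```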
1 code implementation • 14 Jun 2016 • Stefan Hadjis, Ce Zhang, Ioannis Mitliagkas, Dan Iter, Christopher Ré
Given a specification of a convolutional neural network, our goal is to minimize the time to train this model on a cluster of commodity CPUs and GPUs.
no code implementations • NeurIPS 2016 • Bryan He, Christopher De Sa, Ioannis Mitliagkas, Christopher Ré
Gibbs sampling is a Markov Chain Monte Carlo sampling technique that iteratively samples variables from their conditional distributions.
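A minimal Gibbs sampler for a bivariate Gaussian with correlation ρ illustrates the conditional-resampling loop described above (the distribution and chain length are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.5                       # target correlation of the bivariate Gaussian
x, y = 0.0, 0.0
samples = []
for _ in range(20000):
    # Each variable is redrawn from its conditional given the other:
    # x | y ~ N(rho*y, 1 - rho^2), and symmetrically for y | x.
    x = rho * y + np.sqrt(1 - rho**2) * rng.standard_normal()
    y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal()
    samples.append((x, y))

xs, ys = np.array(samples).T
# The empirical correlation should be close to the target rho.
assert abs(np.corrcoef(xs, ys)[0, 1] - rho) < 0.07
```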
3 code implementations • 31 May 2016 • Ioannis Mitliagkas, Ce Zhang, Stefan Hadjis, Christopher Ré
Since asynchronous methods have better hardware efficiency, this result may shed light on when asynchronous execution is more efficient for deep learning systems.
no code implementations • NeurIPS 2013 • Ioannis Mitliagkas, Constantine Caramanis, Prateek Jain
Standard algorithms require $O(p^2)$ memory; meanwhile no algorithm can do better than $O(kp)$ memory, since this is what the output itself requires.