no code implementations • 7 Feb 2024 • Daniel Beaglehole, Ioannis Mitliagkas, Atish Agarwala
Prior works have identified that the Gram matrices of the weights in trained neural networks of general architectures are proportional to the average gradient outer product of the model, a statement known as the Neural Feature Ansatz (NFA).
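As an illustrative sketch (not the paper's construction), the NFA relation holds exactly in the trivial case of a one-layer linear model f(x) = w·x, where the input gradient equals w for every sample, so the average gradient outer product coincides with WᵀW; sizes below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 100
w = rng.standard_normal(d)        # weights of a linear model f(x) = w @ x
grads = np.tile(w, (n, 1))        # input gradient of f is w for every sample
agop = grads.T @ grads / n        # average gradient outer product over n samples
gram = np.outer(w, w)             # W^T W for the 1 x d weight matrix
assert np.allclose(agop, gram)    # exact proportionality (constant 1) in this case
```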
1 code implementation • 22 Aug 2023 • Charles Guille-Escuret, Pierre-André Noël, Ioannis Mitliagkas, David Vazquez, Joao Monteiro
Our findings reveal that while these methods excel in detecting unknown classes, their performance is inconsistent when encountering other types of distribution shifts.
1 code implementation • 17 Jul 2023 • Hiroki Naganuma, Ryuichiro Hataya, Ioannis Mitliagkas
In the realm of out-of-distribution (OOD) generalization tasks, fine-tuning pre-trained models has become a prevalent strategy.
1 code implementation • 20 Jun 2023 • Charles Guille-Escuret, Hiroki Naganuma, Kilian Fatras, Ioannis Mitliagkas
Understanding the optimization dynamics of neural networks is necessary for closing the gap between theory and practice.
1 code implementation • 14 Apr 2023 • Mehrnaz Mofakhami, Ioannis Mitliagkas, Gauthier Gidel
In this work, we instead assume that the data distribution is Lipschitz continuous with respect to the model's predictions, a more natural assumption for performative systems.
1 code implementation • 26 Nov 2022 • Sébastien Lachapelle, Tristan Deleu, Divyat Mahajan, Ioannis Mitliagkas, Yoshua Bengio, Simon Lacoste-Julien, Quentin Bertrand
Although disentangled representations are often said to be beneficial for downstream tasks, current empirical and theoretical understanding is limited.
1 code implementation • 15 Nov 2022 • Hiroki Naganuma, Kartik Ahuja, Shiro Takagi, Tetsuya Motokawa, Rio Yokota, Kohta Ishikawa, Ikuro Sato, Ioannis Mitliagkas
Modern deep learning systems do not generalize well when the test data distribution is slightly different from the training data distribution.
1 code implementation • 3 Nov 2022 • Divyat Mahajan, Ioannis Mitliagkas, Brady Neal, Vasilis Syrgkanis
We study the problem of model selection in causal inference, specifically for conditional average treatment effect (CATE) estimation.
1 code implementation • 6 Oct 2022 • Adam Ibrahim, Charles Guille-Escuret, Ioannis Mitliagkas, Irina Rish, David Krueger, Pouya Bashivan
Compared to existing methods, we obtain similar or superior worst-case adversarial robustness on attacks seen during training.
no code implementations • 3 Oct 2022 • Tiago Salvador, Kilian Fatras, Ioannis Mitliagkas, Adam Oberman
In this work, we consider the Partial Domain Adaptation (PDA) variant, where we have extra source classes not present in the target domain.
no code implementations • 29 Sep 2022 • Alireza Mousavi-Hosseini, Sejun Park, Manuela Girotti, Ioannis Mitliagkas, Murat A. Erdogdu
We further demonstrate that SGD-trained ReLU NNs can learn a single-index target of the form $y=f(\langle\boldsymbol{u},\boldsymbol{x}\rangle) + \epsilon$ by recovering the principal direction, with a sample complexity linear in $d$ (up to log factors), where $f$ is a monotonic function with at most polynomial growth, and $\epsilon$ is the noise.
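A minimal sketch of this setting (sizes, the tanh link, and all hyperparameters are illustrative choices, not the paper's): data are generated from the single-index model with a monotone link, and a one-hidden-layer ReLU network is trained by plain SGD on the squared loss:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, m, lr = 10, 2000, 32, 0.02
u = rng.standard_normal(d)
u /= np.linalg.norm(u)                                # hidden principal direction
X = rng.standard_normal((n, d))
y = np.tanh(X @ u) + 0.05 * rng.standard_normal(n)    # monotone link + noise

W = rng.standard_normal((m, d)) / np.sqrt(d)          # first-layer weights
a = rng.standard_normal(m) / np.sqrt(m)               # output weights

def mse():
    return np.mean((np.maximum(X @ W.T, 0.0) @ a - y) ** 2)

mse_init = mse()
for _ in range(20000):                                # plain single-sample SGD
    i = rng.integers(n)
    h = np.maximum(W @ X[i], 0.0)                     # hidden ReLU activations
    err = a @ h - y[i]
    a -= lr * err * h                                 # output-layer gradient step
    W -= lr * err * np.outer(a * (h > 0), X[i])       # first-layer gradient step
assert mse() < mse_init                               # training reduces the loss
```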
no code implementations • 22 Jun 2022 • Kilian Fatras, Hiroki Naganuma, Ioannis Mitliagkas
It is common in computer vision to be confronted with domain shift: images which have the same class but different acquisition conditions.
3 code implementations • 12 Jun 2022 • Samuel Sokota, Ryan D'Orazio, J. Zico Kolter, Nicolas Loizou, Marc Lanctot, Ioannis Mitliagkas, Noam Brown, Christian Kroer
This work studies an algorithm, which we call magnetic mirror descent, that is inspired by mirror descent and the non-Euclidean proximal gradient algorithm.
1 code implementation • 10 Apr 2022 • Kartik Ahuja, Divyat Mahajan, Vasilis Syrgkanis, Ioannis Mitliagkas
In this work, we depart from these assumptions and ask: a) How can we get disentanglement when the auxiliary information does not provide conditional independence over the factors of variation?
no code implementations • 28 Oct 2021 • Ryan D'Orazio, Nicolas Loizou, Issam Laradji, Ioannis Mitliagkas
We investigate the convergence of stochastic mirror descent (SMD) under interpolation in relatively smooth and smooth convex optimization.
no code implementations • 20 Oct 2021 • Manuela Girotti, Ioannis Mitliagkas, Gauthier Gidel
We theoretically analyze the Feedback Alignment (FA) algorithm, an efficient alternative to backpropagation for training neural networks.
no code implementations • NeurIPS Workshop DLDE 2021 • Alexia Jolicoeur-Martineau, Ke Li, Rémi Piché-Taillefer, Tal Kachman, Ioannis Mitliagkas
Score-based (denoising diffusion) generative models have recently gained a lot of success in generating realistic and diverse data.
1 code implementation • NeurIPS 2021 • Nicolas Loizou, Hugo Berard, Gauthier Gidel, Ioannis Mitliagkas, Simon Lacoste-Julien
Two of the most prominent algorithms for solving unconstrained smooth games are the classical stochastic gradient descent-ascent (SGDA) and the recently introduced stochastic consensus optimization (SCO) [Mescheder et al., 2017].
2 code implementations • NeurIPS 2021 • Kartik Ahuja, Ethan Caballero, Dinghuai Zhang, Jean-Christophe Gagnon-Audet, Yoshua Bengio, Ioannis Mitliagkas, Irina Rish
To answer these questions, we revisit the fundamental assumptions in linear regression tasks, where invariance-based approaches were shown to provably generalize OOD.
1 code implementation • 28 May 2021 • Alexia Jolicoeur-Martineau, Ke Li, Rémi Piché-Taillefer, Tal Kachman, Ioannis Mitliagkas
For high-resolution images, our method leads to significantly higher quality samples than all other methods tested.
Ranked #8 on Image Generation on CIFAR-10 (Inception score metric)
no code implementations • 10 Dec 2020 • Charles Guille-Escuret, Baptiste Goujaud, Manuela Girotti, Ioannis Mitliagkas
Since smoothness and strong convexity are not continuous, we propose a comprehensive study of existing alternative metrics which we prove to be continuous.
1 code implementation • 26 Oct 2020 • Reyhane Askari Hemmat, Amartya Mitra, Guillaume Lajoie, Ioannis Mitliagkas
A central obstacle in the optimization of such games is the rotational dynamics that hinder their convergence.
1 code implementation • NeurIPS 2020 • Gintare Karolina Dziugaite, Alexandre Drouin, Brady Neal, Nitarshan Rajkumar, Ethan Caballero, Linbo Wang, Ioannis Mitliagkas, Daniel M. Roy
A large volume of work aims to close this gap, primarily by developing bounds on generalization error, optimization error, and excess risk.
1 code implementation • ICLR 2021 • Alexia Jolicoeur-Martineau, Rémi Piché-Taillefer, Rémi Tachet des Combes, Ioannis Mitliagkas
Denoising Score Matching with Annealed Langevin Sampling (DSM-ALS) has recently found success in generative modeling.
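As a hedged illustration of the Langevin sampling component only (using the analytic score of a standard Gaussian, score(x) = −x, in place of a learned score network; step size and iteration count are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
step = 0.1
x = rng.standard_normal(10000) * 5.0   # initialize far from the target distribution
for _ in range(500):
    # Langevin update: drift along the score plus Gaussian noise.
    x = x + step * (-x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)

# Samples should approximate N(0, 1) (up to discretization bias).
assert abs(x.mean()) < 0.1 and abs(x.std() - 1.0) < 0.1
```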
Ranked #54 on Image Generation on CIFAR-10
no code implementations • ICML 2020 • Nicolas Loizou, Hugo Berard, Alexia Jolicoeur-Martineau, Pascal Vincent, Simon Lacoste-Julien, Ioannis Mitliagkas
The success of adversarial formulations in machine learning has brought renewed motivation for smooth games.
no code implementations • 2 Jan 2020 • Waïss Azizian, Damien Scieur, Ioannis Mitliagkas, Simon Lacoste-Julien, Gauthier Gidel
Using this perspective, we propose an optimal algorithm for bilinear games.
2 code implementations • 3 Nov 2019 • Isabela Albuquerque, João Monteiro, Mohammad Darvishi, Tiago H. Falk, Ioannis Mitliagkas
In this work, we tackle this problem by focusing on domain generalization: a formalization in which the data-generating process at test time may yield samples from never-before-seen domains (distributions).
Ranked #61 on Domain Generalization on PACS
2 code implementations • 15 Oct 2019 • Alexia Jolicoeur-Martineau, Ioannis Mitliagkas
We present a unifying framework of expected margin maximization and show that a wide range of gradient-penalized GANs (e.g., Wasserstein, Standard, Least-Squares, and Hinge GANs) can be derived from this framework.
Ranked #131 on Image Generation on CIFAR-10
no code implementations • ICML 2020 • Adam Ibrahim, Waïss Azizian, Gauthier Gidel, Ioannis Mitliagkas
In this work, we approach the question of fundamental iteration complexity by providing lower bounds to complement the linear (i.e., geometric) upper bounds observed in the literature on a wide class of problems.
no code implementations • 13 Jun 2019 • Waïss Azizian, Ioannis Mitliagkas, Simon Lacoste-Julien, Gauthier Gidel
We provide new analyses of EG's local and global convergence properties and use them to obtain a tighter global convergence rate for OG and CO. Our analysis covers the whole range of settings between bilinear and strongly monotone games.
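For intuition, a toy sketch (not from the paper) of the extragradient (EG) update on the bilinear game min_x max_y xy, where simultaneous gradient descent-ascent spirals outward but EG converges to the equilibrium at the origin:

```python
# Bilinear game: L(x, y) = x * y, so dL/dx = y and dL/dy = x.
x, y, lr = 1.0, 1.0, 0.1
for _ in range(2000):
    # Extrapolation (half) step: one gradient descent-ascent step.
    x_h, y_h = x - lr * y, y + lr * x
    # Update step: apply the gradients evaluated at the extrapolated point.
    x, y = x - lr * y_h, y + lr * x_h

assert abs(x) < 1e-2 and abs(y) < 1e-2   # converged near the saddle (0, 0)
```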
1 code implementation • NeurIPS 2019 • Sébastien M. R. Arnold, Pierre-Antoine Manzagol, Reza Babanezhad, Ioannis Mitliagkas, Nicolas Le Roux
While variance reduction methods have shown that reusing past gradients can be beneficial when there is a finite number of datapoints, they do not easily extend to the online setting.
no code implementations • 26 May 2019 • Alex Lamb, Jonathan Binas, Anirudh Goyal, Sandeep Subramanian, Ioannis Mitliagkas, Denis Kazakov, Yoshua Bengio, Michael C. Mozer
Machine learning promises methods that generalize well from finite labeled data.
no code implementations • ICML Workshop Deep_Phenomen 2019 • Brady Neal, Ioannis Mitliagkas
There is significant recent evidence in supervised learning that, in the over-parametrized setting, wider networks achieve better test error.
1 code implementation • ICLR 2019 • Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Aaron Courville, Ioannis Mitliagkas, Yoshua Bengio
Because the hidden states are learned, this has the important effect of encouraging the hidden states for a class to be concentrated in such a way that interpolations within the same class or between two different classes do not intersect the real data points from other classes.
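The interpolation itself can be sketched as follows (dimensions and the Beta(2, 2) mixing distribution are illustrative choices, not the paper's exact configuration): a hidden state and a soft label are mixed with the same coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
h1, h2 = rng.standard_normal((2, 8))   # hidden states of two training examples
y1, y2 = np.eye(3)[0], np.eye(3)[1]    # one-hot labels for 3 classes
lam = rng.beta(2.0, 2.0)               # mixing coefficient drawn from Beta(2, 2)

h_mix = lam * h1 + (1 - lam) * h2      # interpolated hidden state
y_mix = lam * y1 + (1 - lam) * y2      # matching interpolated (soft) label

assert h_mix.shape == (8,) and abs(y_mix.sum() - 1.0) < 1e-9
```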
no code implementations • 29 Mar 2019 • Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Jennifer Chayes, Eric Chung, Bill Dally, Jeff Dean, Inderjit S. Dhillon, Alexandros Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R. Ganger, Lise Getoor, Phillip B. Gibbons, Garth A. Gibson, Joseph E. Gonzalez, Justin Gottschlich, Song Han, Kim Hazelwood, Furong Huang, Martin Jaggi, Kevin Jamieson, Michael I. Jordan, Gauri Joshi, Rania Khalaf, Jason Knight, Jakub Konečný, Tim Kraska, Arun Kumar, Anastasios Kyrillidis, Aparna Lakshmiratan, Jing Li, Samuel Madden, H. Brendan McMahan, Erik Meijer, Ioannis Mitliagkas, Rajat Monga, Derek Murray, Kunle Olukotun, Dimitris Papailiopoulos, Gennady Pekhimenko, Theodoros Rekatsinas, Afshin Rostamizadeh, Christopher Ré, Christopher De Sa, Hanie Sedghi, Siddhartha Sen, Virginia Smith, Alex Smola, Dawn Song, Evan Sparks, Ion Stoica, Vivienne Sze, Madeleine Udell, Joaquin Vanschoren, Shivaram Venkataraman, Rashmi Vinayak, Markus Weimer, Andrew Gordon Wilson, Eric Xing, Matei Zaharia, Ce Zhang, Ameet Talwalkar
Machine learning (ML) techniques are enjoying rapidly increasing adoption.
1 code implementation • ICLR 2019 • Isabela Albuquerque, João Monteiro, Thang Doan, Breandan Considine, Tiago Falk, Ioannis Mitliagkas
Recent literature has demonstrated promising results for training Generative Adversarial Networks by employing a set of discriminators, in contrast to the traditional game involving one generator against a single adversary.
no code implementations • 19 Oct 2018 • Brady Neal, Sarthak Mittal, Aristide Baratin, Vinayak Tantia, Matthew Scicluna, Simon Lacoste-Julien, Ioannis Mitliagkas
The bias-variance tradeoff tells us that as model complexity increases, bias falls and variance increases, leading to a U-shaped test error curve.
1 code implementation • ICLR 2019 • Devansh Arpit, Bhargav Kanuparthi, Giancarlo Kerg, Nan Rosemary Ke, Ioannis Mitliagkas, Yoshua Bengio
This problem becomes more evident in tasks where the information needed to correctly solve them exists over long time scales, because the EVGP prevents important gradient components from being back-propagated adequately over a large number of steps.
1 code implementation • 12 Jul 2018 • Gauthier Gidel, Reyhane Askari Hemmat, Mohammad Pezeshki, Remi Lepriol, Gabriel Huang, Simon Lacoste-Julien, Ioannis Mitliagkas
Games generalize the single-objective optimization paradigm by introducing different objective functions for different players.
12 code implementations • ICLR 2019 • Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Ioannis Mitliagkas, Aaron Courville, David Lopez-Paz, Yoshua Bengio
Deep neural networks excel at learning the training data, but often provide incorrect and confident predictions when evaluated on slightly different test examples.
Ranked #18 on Image Classification on OmniBenchmark
1 code implementation • ICLR 2019 • Alex Lamb, Jonathan Binas, Anirudh Goyal, Dmitriy Serdyuk, Sandeep Subramanian, Ioannis Mitliagkas, Yoshua Bengio
Deep networks have achieved impressive results across a variety of important tasks.
no code implementations • ICLR 2018 • Brady Neal, Alex Lamb, Sherjil Ozair, Devon Hjelm, Aaron Courville, Yoshua Bengio, Ioannis Mitliagkas
One of the most successful techniques in generative models has been decomposing a complicated generation task into a series of simpler generation tasks.
no code implementations • 17 Aug 2017 • Thorsten Kurth, Jian Zhang, Nadathur Satish, Ioannis Mitliagkas, Evan Racah, Mostofa Ali Patwary, Tareq Malas, Narayanan Sundaram, Wahid Bhimji, Mikhail Smorkalov, Jack Deslippe, Mikhail Shiryaev, Srinivas Sridharan, Prabhat, Pradeep Dubey
This paper presents the first 15-PetaFLOP deep learning system for solving scientific pattern classification problems on contemporary HPC architectures.
no code implementations • ICML 2017 • Ioannis Mitliagkas, Lester Mackey
The pairwise influence matrix of Dobrushin has long been used as an analytical tool to bound the rate of convergence of Gibbs sampling.
2 code implementations • 10 Jul 2017 • Christopher De Sa, Bryan He, Ioannis Mitliagkas, Christopher Ré, Peng Xu
We propose a simple variant of the power iteration with an added momentum term that achieves both the optimal sample and iteration complexity.
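A hedged sketch of the deterministic variant on a synthetic symmetric matrix: the recursion adds a heavy-ball-style term −βx₍ₜ₋₁₎ to the usual power step, with β set near λ₂²/4 (matrix size, spectrum, and iteration count are arbitrary choices here):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal basis
eigvals = np.linspace(1.0, 3.0, n)                 # known spectrum with a gap
A = Q @ np.diag(eigvals) @ Q.T                     # symmetric test matrix

def power_iteration_momentum(A, beta, iters=400):
    x_prev = np.zeros(A.shape[0])
    x = rng.standard_normal(A.shape[0])
    for _ in range(iters):
        x_new = A @ x - beta * x_prev              # power step with momentum
        scale = np.linalg.norm(x_new)
        x, x_prev = x_new / scale, x / scale       # rescale both iterates jointly
    return x

beta = (eigvals[-2] ** 2) / 4.0                    # momentum near lambda_2^2 / 4
v = power_iteration_momentum(A, beta)
top = Q[:, -1]                                     # eigenvector of the top eigenvalue
assert abs(v @ top) > 0.98                         # recovered the top direction
```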
3 code implementations • ICML 2018 • Panos Achlioptas, Olga Diamanti, Ioannis Mitliagkas, Leonidas Guibas
Three-dimensional geometric data offer an excellent domain for studying representation learning and generative modeling.
2 code implementations • ICLR 2018 • Jian Zhang, Ioannis Mitliagkas
We revisit the momentum SGD algorithm and show that hand-tuning a single learning rate and momentum makes it competitive with Adam.
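The update being tuned is classic heavy-ball momentum SGD; a minimal sketch on a toy least-squares problem, with illustrative (learning rate, momentum) values rather than tuned ones:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 10))
w_star = np.ones(10)
b = A @ w_star                          # consistent system: zero loss at w_star
w = np.zeros(10)
v = np.zeros(10)
lr, mu = 0.005, 0.9                     # single learning rate and momentum

for _ in range(5000):
    i = rng.integers(500)
    grad = (A[i] @ w - b[i]) * A[i]     # stochastic gradient of 0.5*(A[i]@w - b[i])^2
    v = mu * v - lr * grad              # heavy-ball velocity update
    w = w + v

assert np.linalg.norm(w - w_star) < 0.1   # converged near the solution
```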
no code implementations • 23 Jun 2016 • Jian Zhang, Christopher De Sa, Ioannis Mitliagkas, Christopher Ré
Consider a number of workers running SGD independently on the same pool of data and averaging the models every once in a while -- a common but not well understood practice.
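A toy single-process sketch of that practice (problem, worker count, and averaging schedule are all hypothetical): several "workers" run SGD independently from a shared model and their parameters are averaged at the end of each round.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))
w_star = np.ones(5)
b = A @ w_star                          # shared least-squares problem

def sgd(w, steps, lr=0.01):
    for _ in range(steps):
        i = rng.integers(len(b))
        w = w - lr * (A[i] @ w - b[i]) * A[i]
    return w

workers = [rng.standard_normal(5) for _ in range(4)]
for _ in range(20):                     # 20 rounds of local SGD + averaging
    workers = [sgd(w, 50) for w in workers]
    avg = sum(workers) / len(workers)   # average the models "every once in a while"
    workers = [avg.copy() for _ in workers]

assert np.linalg.norm(avg - w_star) < 0.1
```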
1 code implementation • 14 Jun 2016 • Stefan Hadjis, Ce Zhang, Ioannis Mitliagkas, Dan Iter, Christopher Ré
Given a specification of a convolutional neural network, our goal is to minimize the time to train this model on a cluster of commodity CPUs and GPUs.
no code implementations • NeurIPS 2016 • Bryan He, Christopher De Sa, Ioannis Mitliagkas, Christopher Ré
Gibbs sampling is a Markov Chain Monte Carlo sampling technique that iteratively samples variables from their conditional distributions.
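A minimal Gibbs sampler for a bivariate Gaussian with correlation ρ illustrates the conditional-resampling loop described above (the distribution and chain length are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.5                       # target correlation of the bivariate Gaussian
x, y = 0.0, 0.0
samples = []
for _ in range(20000):
    # Each variable is redrawn from its conditional given the other:
    # x | y ~ N(rho*y, 1 - rho^2), and symmetrically for y | x.
    x = rho * y + np.sqrt(1 - rho**2) * rng.standard_normal()
    y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal()
    samples.append((x, y))

xs, ys = np.array(samples).T
# The empirical correlation should be close to the target rho.
assert abs(np.corrcoef(xs, ys)[0, 1] - rho) < 0.07
```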
3 code implementations • 31 May 2016 • Ioannis Mitliagkas, Ce Zhang, Stefan Hadjis, Christopher Ré
Since asynchronous methods have better hardware efficiency, this result may shed light on when asynchronous execution is more efficient for deep learning systems.
no code implementations • NeurIPS 2013 • Ioannis Mitliagkas, Constantine Caramanis, Prateek Jain
Standard algorithms require $O(p^2)$ memory; meanwhile no algorithm can do better than $O(kp)$ memory, since this is what the output itself requires.