Search Results for author: Jenia Jitsev

Found 15 papers, 11 papers with code

Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models

1 code implementation • 4 Jun 2024 • Marianna Nezhurina, Lucia Cipolina-Kun, Mehdi Cherti, Jenia Jitsev

Large Language Models (LLMs) are often described as being instances of foundation models - that is, models that transfer strongly across various tasks and conditions in few-show or zero-shot manner, while exhibiting scaling laws that predict function improvement when increasing the pre-training scale.

Common Sense Reasoning

Paper
Code

Language models scale reliably with over-training and on downstream tasks

1 code implementation • 13 Mar 2024 • Samir Yitzhak Gadre, Georgios Smyrnis, Vaishaal Shankar, Suchin Gururangan, Mitchell Wortsman, Rulin Shao, Jean Mercat, Alex Fang, Jeffrey Li, Sedrick Keh, Rui Xin, Marianna Nezhurina, Igor Vasiljevic, Jenia Jitsev, Alexandros G. Dimakis, Gabriel Ilharco, Shuran Song, Thomas Kollar, Yair Carmon, Achal Dave, Reinhard Heckel, Niklas Muennighoff, Ludwig Schmidt

We fit scaling laws that extrapolate in both the number of model parameters and the ratio of training tokens to parameters.

Language Modelling

Paper
Code

OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models

2 code implementations • 2 Aug 2023 • Anas Awadalla, Irena Gao, Josh Gardner, Jack Hessel, Yusuf Hanafy, Wanrong Zhu, Kalyani Marathe, Yonatan Bitton, Samir Gadre, Shiori Sagawa, Jenia Jitsev, Simon Kornblith, Pang Wei Koh, Gabriel Ilharco, Mitchell Wortsman, Ludwig Schmidt

We introduce OpenFlamingo, a family of autoregressive vision-language models ranging from 3B to 9B parameters.

Ranked #14 on Visual Question Answering (VQA) on InfiMM-Eval

Visual Question Answering

3,520

Paper
Code

DataComp: In search of the next generation of multimodal datasets

1 code implementation • NeurIPS 2023 • Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei Koh, Olga Saukh, Alexander Ratner, Shuran Song, Hannaneh Hajishirzi, Ali Farhadi, Romain Beaumont, Sewoong Oh, Alex Dimakis, Jenia Jitsev, Yair Carmon, Vaishaal Shankar, Ludwig Schmidt

Multimodal datasets are a critical component in recent breakthroughs such as Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms.

Paper
Code

A Comparative Study on Generative Models for High Resolution Solar Observation Imaging

1 code implementation • 14 Apr 2023 • Mehdi Cherti, Alexander Czernik, Stefan Kesselheim, Frederic Effenberger, Jenia Jitsev

Starting from StyleGAN-based methods, we uncover severe deficits of this model family in handling fine-scale details of solar images when training on high resolution samples, contrary to training on natural face images.

Paper
Code

Reproducible scaling laws for contrastive language-image learning

3 code implementations • CVPR 2023 • Mehdi Cherti, Romain Beaumont, Ross Wightman, Mitchell Wortsman, Gabriel Ilharco, Cade Gordon, Christoph Schuhmann, Ludwig Schmidt, Jenia Jitsev

To address these limitations, we investigate scaling laws for contrastive language-image pre-training (CLIP) with the public LAION dataset and the open-source OpenCLIP repository.

Ranked #1 on Zero-Shot Image Classification on Country211 (using extra training data)

Image Classification Open Vocabulary Attribute Detection +4

8,853

Paper
Code

Towards prediction of turbulent flows at high Reynolds numbers using high performance computing data and deep learning

no code implementations • 28 Oct 2022 • Mathis Bode, Michael Gauding, Jens Henrik Göbbert, Baohao Liao, Jenia Jitsev, Heinz Pitsch

In this paper, deep learning (DL) methods are evaluated in the context of turbulent flows.

Vocal Bursts Intensity Prediction

Paper
Add Code

LAION-5B: An open large-scale dataset for training next generation image-text models

3 code implementations • NeurIPS 2022 Datasets and Benchmarks 2022 • Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, Patrick Schramowski, Srivatsa Kundurthy, Katherine Crowson, Ludwig Schmidt, Robert Kaczmarczyk, Jenia Jitsev

We show successful replication and fine-tuning of foundational models like CLIP, GLIDE and Stable Diffusion using the dataset, and discuss further experiments enabled with an openly available dataset of this scale.

Image Generation Zero-Shot Learning

8,853

Paper
Code

LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs

2 code implementations • 3 Nov 2021 • Christoph Schuhmann, Richard Vencu, Romain Beaumont, Robert Kaczmarczyk, Clayton Mullis, Aarush Katta, Theo Coombes, Jenia Jitsev, Aran Komatsuzaki

Multi-modal language-vision models trained on hundreds of millions of image-text pairs (e. g.

Few-Shot Learning

10,874

Paper
Code

JUWELS Booster -- A Supercomputer for Large-Scale AI Research

no code implementations • 30 Jun 2021 • Stefan Kesselheim, Andreas Herten, Kai Krajsek, Jan Ebert, Jenia Jitsev, Mehdi Cherti, Michael Langguth, Bing Gong, Scarlet Stadtler, Amirpasha Mozaffari, Gabriele Cavallaro, Rocco Sedona, Alexander Schug, Alexandre Strube, Roshni Kamath, Martin G. Schultz, Morris Riedel, Thomas Lippert

In this article, we present JUWELS Booster, a recently commissioned high-performance computing system at the J\"ulich Supercomputing Center.

Paper
Add Code

Effect of Pre-Training Scale on Intra- and Inter-Domain Full and Few-Shot Transfer Learning for Natural and Medical X-Ray Chest Images

1 code implementation • 31 May 2021 • Mehdi Cherti, Jenia Jitsev

We then conduct supervised pre-training, while varying network size and source data scale and domain, being either large natural (ImageNet-1k/21k) or large medical chest X-Ray datasets, and transfer pre-trained models to different natural or medical targets.

Ranked #2 on Image Classification on Oxford-IIIT Pet Dataset

Few-Shot Learning Image Classification +2

Paper
Code

Generalization over different cellular automata rules learned by a deep feed-forward neural network

1 code implementation • 27 Mar 2021 • Marcel Aach, Jens Henrik Goebbert, Jenia Jitsev

To test generalization ability of a class of deep neural networks, we randomly generate a large number of different rule sets for 2-D cellular automata (CA), based on John Conway's Game of Life.

Decoder

Paper
Code

Adversarial domain adaptation to reduce sample bias of a high energy physics classifier

no code implementations • 1 May 2020 • Jose M. Clavijo, Paul Glaysher, Judith M. Katzy, Jenia Jitsev

We apply adversarial domain adaptation in unsupervised setting to reduce sample bias in a supervised high energy physics events classifier training.

BIG-bench Machine Learning Classification +3

Paper
Add Code

Obstacle Tower Without Human Demonstrations: How Far a Deep Feed-Forward Network Goes with Reinforcement Learning

1 code implementation • 1 Apr 2020 • Marco Pleines, Jenia Jitsev, Mike Preuss, Frank Zimmer

The Obstacle Tower Challenge is the task to master a procedurally generated chain of levels that subsequently get harder to complete.

Paper
Code

Using Physics-Informed Super-Resolution Generative Adversarial Networks for Subgrid Modeling in Turbulent Reactive Flows

no code implementations • 26 Nov 2019 • Mathis Bode, Michael Gauding, Zeyu Lian, Dominik Denker, Marco Davidovic, Konstantin Kleinheinz, Jenia Jitsev, Heinz Pitsch

Reasons for this are the large amount of degrees of freedom in realistic flows, the high requirements with respect to accuracy and error robustness, as well as open questions, such as the generalization capability of trained neural networks in such high-dimensional, physics-constrained scenarios.

Generative Adversarial Network Super-Resolution

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.