1 code implementation • 17 Oct 2023 • Dongyoung Go, Tomasz Korbak, Germán Kruszewski, Jos Rozen, Marc Dymetman
As language models (LMs) become more capable, it is increasingly important to align them with human preferences.
1 code implementation • 30 Jun 2023 • Nadezhda Chirkova, Germán Kruszewski, Jos Rozen, Marc Dymetman
Autoregressive language models (LMs) map token sequences to probabilities.
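As a minimal illustration of that mapping, an autoregressive LM scores a sequence by the chain rule over next-token probabilities. The sketch below uses a made-up scoring callback and toy probability table, not anything from the paper.

```python
import math

def sequence_log_prob(next_token_log_probs, tokens):
    """Chain-rule scoring: log p(x_1..x_n) = sum_t log p(x_t | x_<t).
    `next_token_log_probs(prefix)` is a hypothetical callback returning a
    dict of next-token log-probabilities given the prefix."""
    log_p, prefix = 0.0, []
    for tok in tokens:
        log_p += next_token_log_probs(tuple(prefix))[tok]
        prefix.append(tok)
    return log_p

# Toy conditional table standing in for an actual LM.
table = {
    (): {"the": math.log(0.6), "a": math.log(0.4)},
    ("the",): {"cat": math.log(0.5), "dog": math.log(0.5)},
}
print(sequence_log_prob(lambda p: table[p], ["the", "cat"]))  # = log(0.6 * 0.5)
```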
1 code implementation • 8 Mar 2023 • Germán Kruszewski, Jos Rozen, Marc Dymetman
Pre-trained language models and other generative models have revolutionized NLP and beyond.
1 code implementation • 16 Feb 2023 • Dongyoung Go, Tomasz Korbak, Germán Kruszewski, Jos Rozen, Nahyeon Ryu, Marc Dymetman
We show that Jensen-Shannon divergence strikes a good balance between these objectives, and frequently outperforms forward KL divergence by a wide margin, leading to significant improvements over prior work.
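For intuition, here is a small numerical sketch contrasting forward KL with the bounded, symmetric Jensen-Shannon divergence; the toy distributions are made up for illustration and are not data from the paper.

```python
import numpy as np

def kl(p, q):
    """Forward KL divergence KL(p || q) between discrete distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def js(p, q):
    """Jensen-Shannon divergence: average KL of each distribution to their mixture."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

target = [0.7, 0.2, 0.1]   # e.g. a preference-reweighted target (made up)
model  = [0.3, 0.4, 0.3]   # e.g. the current fine-tuned model (made up)
print(kl(target, model), js(target, model))
```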
2 code implementations • 1 Jun 2022 • Tomasz Korbak, Hady Elsahar, Germán Kruszewski, Marc Dymetman
Here we explore the theoretical connections between the two paradigms, and show that methods such as KL-control developed for RM can also be construed as belonging to DM.
no code implementations • 10 Dec 2021 • Bryan Eikema, Germán Kruszewski, Hady Elsahar, Marc Dymetman
We show that we can sample from such EBMs with arbitrary precision at the cost of sampling efficiency.
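A minimal sketch of that precision/efficiency trade-off, using rejection-style sampling from an unnormalized EBM through a tractable proposal; the function names and arguments are hypothetical, and this is not the paper's exact algorithm.

```python
import math, random

def rejection_sample(log_score, propose, proposal_log_prob, beta, n_draws):
    """Draw x from a proposal q and accept with probability
    min(1, P(x) / (beta * q(x))), where P(x) = exp(log_score(x)) is the
    unnormalized EBM.  Increasing beta improves the fidelity of accepted
    samples but lowers the acceptance rate (sampling efficiency)."""
    accepted = []
    for _ in range(n_draws):
        x = propose()
        log_accept = min(0.0, log_score(x) - math.log(beta) - proposal_log_prob(x))
        if random.random() < math.exp(log_accept):
            accepted.append(x)
    return accepted
```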
1 code implementation • 1 Dec 2021 • Tomasz Korbak, Hady Elsahar, Germán Kruszewski, Marc Dymetman
Machine learning is shifting towards general-purpose pretrained generative models, trained in a self-supervised manner on large amounts of data, which can then be applied to solve a large number of tasks.
no code implementations • 29 Sep 2021 • Tomasz Korbak, Hady Elsahar, Germán Kruszewski, Marc Dymetman
The availability of large pre-trained models is changing the landscape of Machine Learning research and practice, moving from a "training from scratch" to a "fine-tuning" paradigm.
1 code implementation • 9 Jun 2021 • Tomasz Korbak, Hady Elsahar, Marc Dymetman, Germán Kruszewski
Neural language models can be successfully trained on source code, leading to applications such as code completion.
1 code implementation • ICLR 2021 • Muhammad Khalifa, Hady Elsahar, Marc Dymetman
From that optimal representation we then train a target controlled Autoregressive LM through an adaptive distributional variant of Policy Gradient.
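The distributional policy-gradient estimator behind this step can be sketched as follows (written from the general DPG formulation, so the paper's exact notation and adaptive proposal schedule may differ): the policy pi_theta is pushed toward the normalized target using importance weights from a proposal q, which the adaptive variant periodically resets to the current pi_theta.

```latex
\nabla_\theta \, \mathbb{E}_{x \sim p}\!\left[\log \pi_\theta(x)\right]
  \;\approx\; \frac{1}{Z} \, \mathbb{E}_{x \sim q}\!\left[
      \frac{P(x)}{q(x)} \, \nabla_\theta \log \pi_\theta(x) \right],
  \qquad p(x) = \frac{P(x)}{Z}.
```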
1 code implementation • 18 Dec 2019 • Tetiana Parshakova, Jean-Marc Andreoli, Marc Dymetman
Global Autoregressive Models (GAMs) are a recent proposal [Parshakova et al., CoNLL 2019] for exploiting global properties of sequences for data-efficient learning of seq2seq models.
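In sketch form (notation reconstructed from the GAM line of work, so details may differ from the paper), a GAM reweights an autoregressive component r with a log-linear factor over global sequence features phi:

```latex
P_\lambda(x) \;\propto\; r(x)\,\exp\!\big(\langle \lambda,\, \phi(x) \rangle\big)
```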
no code implementations • 12 Nov 2019 • Rohit Gupta, Laurent Besacier, Marc Dymetman, Matthias Gallé
Character-based translation has several appealing advantages, but its performance is in general worse than a carefully tuned BPE baseline.
no code implementations • WS 2019 • Alexandre Bérard, Ioan Calapodescu, Marc Dymetman, Claude Roux, Jean-Luc Meunier, Vassilina Nikoulina
We share a French-English parallel corpus of Foursquare restaurant reviews (https://europe.naverlabs.com/research/natural-language-processing/machine-translation-of-restaurant-reviews), and define a new task to encourage research on Neural Machine Translation robustness and domain adaptation, in a real-world scenario where better-quality MT would be greatly beneficial.
1 code implementation • CONLL 2019 • Tetiana Parshakova, Jean-Marc Andreoli, Marc Dymetman
In the second step, we use this GAM to train (by distillation) a second autoregressive model that approximates the normalized distribution associated with the GAM, and can be used for fast inference and evaluation.
no code implementations • 24 Dec 2018 • Cong Duy Vu Hoang, Ioan Calapodescu, Marc Dymetman
In previous works, neural sequence models have been shown to improve significantly if external prior knowledge can be provided, for instance by allowing the model to access the embeddings of explicit features during both training and inference.
no code implementations • WS 2018 • Shubham Agarwal, Marc Dymetman, Eric Gaussier
This paper describes our submission to the E2E NLG Challenge.
1 code implementation • 20 Sep 2018 • Chunyang Xiao, Marc Dymetman, Claire Gardent
Seq2seq models based on Recurrent Neural Networks (RNNs) have recently received a lot of attention in the domain of Semantic Parsing for Question Answering.
1 code implementation • WS 2017 • Shubham Agarwal, Marc Dymetman
We train a char2char model on the E2E NLG Challenge data, by exploiting the recently released tfseq2seq framework "out-of-the-box", using some of the standard options offered by this tool.
no code implementations • COLING 2016 • Raghav Goyal, Marc Dymetman, Eric Gaussier
Recently Wen et al. (2015) have proposed a Recurrent Neural Network (RNN) approach to the generation of utterances from dialog acts, and shown that although their model requires less effort to develop than a rule-based system, it is able to improve certain aspects of the utterances, in particular their naturalness.
no code implementations • 8 Jul 2016 • Marc Dymetman, Chunyang Xiao
We introduce LL-RNNs (Log-Linear RNNs), an extension of Recurrent Neural Networks that replaces the softmax output layer by a log-linear output layer, of which the softmax is a special case.
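A toy sketch of such a log-linear output layer; the shapes, names, and the identity feature map used to recover the softmax special case are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def log_linear_output(hidden, word_features, weights):
    """Log-linear output layer: score(w | h) = phi(w) · W · h, normalized over
    the vocabulary.  `word_features` is a (V, k) matrix of word feature
    vectors phi(w); with phi(w) = one-hot(w) (identity matrix) this reduces
    to the standard softmax output layer with logits W·h."""
    logits = word_features @ (weights @ hidden)   # (V,) one score per word
    logits -= logits.max()                        # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

rng = np.random.default_rng(0)
hidden = rng.standard_normal(8)                   # e.g. an RNN hidden state
W = rng.standard_normal((4, 8))                   # 4-word toy vocabulary
print(log_linear_output(hidden, np.eye(4), W))    # softmax special case
```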
no code implementations • WS 2016 • Phong Le, Marc Dymetman, Jean-Michel Renders
We introduce an LSTM-based method for dynamically integrating several word-prediction experts to obtain a conditional language model which can be good simultaneously at several subtasks.
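A minimal sketch of that dynamic integration: a gating network computes mixing coefficients from a hidden state and combines the experts' next-word distributions. Names, shapes, and the toy experts are assumptions, not the paper's architecture.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def mixture_of_experts(hidden, expert_dists, gate_weights):
    """Mix several word-prediction experts: the gate maps the controller's
    hidden state to mixing coefficients, and the next-word distribution is
    the weighted sum of the experts' distributions."""
    alphas = softmax(gate_weights @ hidden)       # (n_experts,) mixing weights
    return alphas @ expert_dists                  # (V,) mixed distribution

rng = np.random.default_rng(1)
hidden = rng.standard_normal(8)                   # e.g. an LSTM hidden state
experts = np.stack([softmax(rng.standard_normal(5)) for _ in range(3)])
gate = rng.standard_normal((3, 8))
print(mixture_of_experts(hidden, experts, gate))  # sums to 1
```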
no code implementations • 7 Oct 2015 • Spandana Gella, Marc Dymetman, Jean-Michel Renders, Sriram Venkatapathy
The experimental results on a large email collection from a contact center in the telecom domain show that the proposed approach is effective in predicting the best topic of the agent's next sentence.
no code implementations • JEPTALNRECITAL 2015 • Christophe Servan, Marc Dymetman
We present preliminary work on an approach for adding bilingual terms to a phrase-based Statistical Machine Translation (SMT) system.