no code implementations • LT4HALA (LREC) 2022 • Margherita Fantoli, Miryam de Lhoneux
This paper describes the process of syntactically parsing the Latin translation by Jacopo da San Cassiano of the Greek mathematical work The Spirals of Archimedes.
1 code implementation • CoNLL (EMNLP) 2021 • Mareike Hartmann, Miryam de Lhoneux, Daniel Hershcovich, Yova Kementchedjhieva, Lukas Nielsen, Chen Qiu, Anders Søgaard
Negation is one of the most fundamental concepts in human cognition and language, and several natural language inference (NLI) probes have been designed to investigate pretrained language models’ ability to detect and reason with negation.
no code implementations • NAACL (AmericasNLP) 2021 • Marcel Bollmann, Rahul Aralikatte, Héctor Murrieta Bello, Daniel Hershcovich, Miryam de Lhoneux, Anders Søgaard
We evaluated a range of neural machine translation techniques developed specifically for low-resource scenarios.
no code implementations • ACL (WAT) 2021 • Rahul Aralikatte, Héctor Ricardo Murrieta Bello, Miryam de Lhoneux, Daniel Hershcovich, Marcel Bollmann, Anders Søgaard
This work shows that competitive translation results can be obtained in a constrained setting by incorporating the latest advances in memory and compute optimization.
1 code implementation • 6 Feb 2024 • Esther Ploeger, Wessel Poelman, Miryam de Lhoneux, Johannes Bjerva
We recommend that future work include an operationalization of 'typological diversity' that empirically justifies the diversity of language samples.
no code implementations • 5 Feb 2024 • Kushal Tatariya, Heather Lent, Johannes Bjerva, Miryam de Lhoneux
Emotion classification is a challenging task in NLP due to the inherent idiosyncratic and subjective nature of linguistic expression, especially with code-mixed data.
1 code implementation • 30 Oct 2023 • Heather Lent, Kushal Tatariya, Raj Dabre, Yiyi Chen, Marcell Fekete, Esther Ploeger, Li Zhou, Ruth-Ann Armstrong, Abee Eijansantos, Catriona Malau, Hans Erik Heje, Ernests Lavrinovics, Diptesh Kanojia, Paul Belony, Marcel Bollmann, Loïc Grobol, Miryam de Lhoneux, Daniel Hershcovich, Michel DeGraff, Anders Søgaard, Johannes Bjerva
Creoles represent an under-explored and marginalized group of languages, with few available resources for NLP research. While the genealogical ties between Creoles and a number of highly-resourced languages imply significant potential for transfer learning, this potential is hampered by the lack of annotated data.
no code implementations • 20 Feb 2023 • Anders Søgaard, Daniel Hershcovich, Miryam de Lhoneux
Van Miltenburg et al. (2021) suggest that NLP research should adopt preregistration to prevent fishing expeditions and to promote publication of negative results.
1 code implementation • 14 Jul 2022 • Phillip Rust, Jonas F. Lotz, Emanuele Bugliarello, Elizabeth Salesky, Miryam de Lhoneux, Desmond Elliott
We pretrain the 86M parameter PIXEL model on the same English data as BERT and evaluate on syntactic and semantic tasks in typologically diverse languages, including various non-Latin scripts.
Ranked #1 on Named Entity Recognition (NER) on MasakhaNER
no code implementations • LREC 2022 • Heather Lent, Kelechi Ogueji, Miryam de Lhoneux, Orevaoghene Ahia, Anders Søgaard
We demonstrate, through conversations with Creole experts and surveys of Creole-speaking communities, how the needs for language technology can change dramatically from one language to another, even when the languages are considered very similar to each other, as with Creoles.
no code implementations • ACL 2022 • Daniel Hershcovich, Stella Frank, Heather Lent, Miryam de Lhoneux, Mostafa Abdou, Stephanie Brandl, Emanuele Bugliarello, Laura Cabello Piqueras, Ilias Chalkidis, Ruixiang Cui, Constanza Fierro, Katerina Margatina, Phillip Rust, Anders Søgaard
Various efforts in the Natural Language Processing (NLP) community have been made to accommodate linguistic diversity and serve speakers of many different languages.
1 code implementation • ACL 2022 • Victor Milewski, Miryam de Lhoneux, Marie-Francine Moens
In this work, we investigate the knowledge learned in the embeddings of multimodal-BERT models.
1 code implementation • ACL 2022 • Miryam de Lhoneux, Sheng Zhang, Anders Søgaard
Large multilingual pretrained language models such as mBERT and XLM-RoBERTa have been found to be surprisingly effective for cross-lingual transfer of syntactic parsing models (Wu and Dredze 2019), but only between related languages.
1 code implementation • ACL (TLT, SyntaxFest) 2021 • Rob van der Goot, Miryam de Lhoneux
With the increased availability of datasets, the potential for learning from a variety of data sources has grown.
1 code implementation • CoNLL (EMNLP) 2021 • Heather Lent, Emanuele Bugliarello, Miryam de Lhoneux, Chen Qiu, Anders Søgaard
Creole languages such as Nigerian Pidgin English and Haitian Creole are under-resourced and largely ignored in the NLP literature.
no code implementations • ACL (WAT) 2021 • Rahul Aralikatte, Miryam de Lhoneux, Anoop Kunchukuttan, Anders Søgaard
This work introduces Itihasa, a large-scale translation dataset containing 93,000 pairs of Sanskrit shlokas and their English translations.
Ranked #1 on Machine Translation on Itihasa
2 code implementations • COLING 2020 • Daniel Hershcovich, Nathan Schneider, Dotan Dvir, Jakob Prange, Miryam de Lhoneux, Omri Abend
Building robust natural language understanding systems will require a clear characterization of whether and how various linguistic meaning representations complement each other.
no code implementations • WS 2020 • Daniel Hershcovich, Miryam de Lhoneux, Artur Kulmizev, Elham Pejhan, Joakim Nivre
We present Køpsala, the Copenhagen-Uppsala system for the Enhanced Universal Dependencies Shared Task at IWPT 2020.
1 code implementation • 25 May 2020 • Daniel Hershcovich, Miryam de Lhoneux, Artur Kulmizev, Elham Pejhan, Joakim Nivre
We present Køpsala, the Copenhagen-Uppsala system for the Enhanced Universal Dependencies Shared Task at IWPT 2020.
no code implementations • IJCNLP 2019 • Artur Kulmizev, Miryam de Lhoneux, Johannes Gontrum, Elena Fano, Joakim Nivre
Transition-based and graph-based dependency parsers have previously been shown to have complementary strengths and weaknesses: transition-based parsers exploit rich structural features but suffer from error propagation, while graph-based parsers benefit from global optimization but have restricted feature scope.
no code implementations • 20 Aug 2019 • Artur Kulmizev, Miryam de Lhoneux, Johannes Gontrum, Elena Fano, Joakim Nivre
Transition-based and graph-based dependency parsers have previously been shown to have complementary strengths and weaknesses: transition-based parsers exploit rich structural features but suffer from error propagation, while graph-based parsers benefit from global optimization but have restricted feature scope.
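The transition-based side of this trade-off can be illustrated with a minimal sketch (not the parsers evaluated in the paper): an arc-standard transition system driven by a static oracle, building a tree through purely local SHIFT/LEFT-ARC/RIGHT-ARC decisions over a stack and buffer. The sentence, gold heads, and helper names here are toy examples.

```python
def parse(words, heads):
    """Static arc-standard oracle: replay gold heads as a sequence of
    local transitions, returning the transitions and predicted arcs.
    `heads[i]` is the gold head index of token i (-1 for the root)."""
    stack, buf, arcs, trans = [], list(range(len(words))), [], []
    while buf or len(stack) > 1:
        if len(stack) >= 2:
            s1, s0 = stack[-2], stack[-1]
            if heads[s1] == s0:                  # LEFT-ARC: s0 heads s1
                arcs.append((s1, s0))
                stack.pop(-2)
                trans.append("LA")
                continue
            # RIGHT-ARC only once all of s0's dependents are attached
            if heads[s0] == s1 and all(
                    heads[i] != s0 or (i, s0) in arcs
                    for i in range(len(words))):
                arcs.append((s0, s1))
                stack.pop()
                trans.append("RA")
                continue
        stack.append(buf.pop(0))                 # SHIFT
        trans.append("SH")
    return trans, arcs

# "ate" (index 1) is the root, heading "she" (0) and "fish" (2).
trans, arcs = parse(["she", "ate", "fish"], [1, -1, 1])
```

Because each transition conditions on earlier ones, a single wrong decision can corrupt the rest of the derivation (error propagation); graph-based parsers instead score all candidate head-dependent arcs and solve for the globally optimal tree, avoiding this but restricting the features each arc decision can see.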
1 code implementation • CL (ACL) 2020 • Miryam de Lhoneux, Sara Stymne, Joakim Nivre
We find that the parser learns different information about AVCs and FMVs if only sequential models (BiLSTMs) are used in the architecture but similar information when a recursive layer is used.
1 code implementation • NAACL 2019 • Miryam de Lhoneux, Miguel Ballesteros, Joakim Nivre
When ablating the forward LSTM, performance drops less dramatically and composition recovers a substantial part of the gap, indicating that a forward LSTM and composition capture similar information.
no code implementations • CoNLL 2018 • Aaron Smith, Bernd Bohnet, Miryam de Lhoneux, Joakim Nivre, Yan Shao, Sara Stymne
We present the Uppsala system for the CoNLL 2018 Shared Task on universal dependency parsing.
no code implementations • WS 2018 • Anders Søgaard, Miryam de Lhoneux, Isabelle Augenstein
Punctuation is a strong indicator of syntactic structure, and parsers trained on text with punctuation often rely heavily on this signal.
1 code implementation • EMNLP 2018 • Miryam de Lhoneux, Johannes Bjerva, Isabelle Augenstein, Anders Søgaard
We find that sharing transition classifier parameters always helps, whereas the usefulness of sharing word and/or character LSTM parameters varies.
no code implementations • EMNLP 2018 • Aaron Smith, Miryam de Lhoneux, Sara Stymne, Joakim Nivre
We provide a comprehensive analysis of the interactions between pre-trained word embeddings, character models and POS tags in a transition-based dependency parser.
1 code implementation • ACL 2018 • Sara Stymne, Miryam de Lhoneux, Aaron Smith, Joakim Nivre
How to make the most of multiple heterogeneous treebanks when training a monolingual dependency parser is an open question.
1 code implementation • WS 2017 • Miryam de Lhoneux, Sara Stymne, Joakim Nivre
In this paper, we extend the arc-hybrid system for transition-based parsing with a swap transition that enables reordering of the words and construction of non-projective trees.
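A minimal sketch of the transition system (an illustration, not the paper's implementation): in arc-hybrid, LEFT-ARC attaches the stack top to the first buffer item and RIGHT-ARC attaches it to the item below it on the stack; the added SWAP transition moves the second-topmost stack item back to the front of the buffer, reordering the words so that non-projective trees become reachable. The token indices and hand-picked transition sequence below are a toy example with crossing arcs.

```python
def step(state, transition):
    """Apply one arc-hybrid (+SWAP) transition to (stack, buffer, arcs)."""
    stack, buf, arcs = state
    if transition == "SHIFT":        # move first buffer item onto the stack
        stack.append(buf.pop(0))
    elif transition == "LEFT-ARC":   # head of stack top = first buffer item
        arcs.append((stack.pop(), buf[0]))
    elif transition == "RIGHT-ARC":  # head of stack top = item below it
        dep = stack.pop()
        arcs.append((dep, stack[-1]))
    elif transition == "SWAP":       # move second stack item back to buffer
        buf.insert(0, stack.pop(-2))
    return stack, buf, arcs

# Non-projective toy tree: arcs 2->0 and 3->1 cross; 2 is the root.
state = ([], [0, 1, 2, 3], [])
for t in ["SHIFT", "SHIFT", "SWAP", "SHIFT", "LEFT-ARC",
          "SHIFT", "SWAP", "SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC"]:
    state = step(state, t)
```

Without SWAP, no sequence of arc-hybrid transitions can produce these crossing arcs, since the stack and buffer always preserve the original word order.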
no code implementations • CoNLL 2017 • Miryam de Lhoneux, Yan Shao, Ali Basirat, Eliyahu Kiperwasser, Sara Stymne, Yoav Goldberg, Joakim Nivre
We present the Uppsala submission to the CoNLL 2017 shared task on parsing from raw text to universal dependencies.
no code implementations • 17 May 2015 • Miryam de Lhoneux
We show that the baseline model performs significantly better on our gold standard when the data are collapsed before parsing than when they are collapsed after parsing, which indicates a parsing effect.