no code implementations • LREC 2020 • Thierry Etchegoyhen, Borja Anza Porras, Andoni Azpeitia, Eva Martínez Garcia, José Luis Fonseca, Patricia Fonseca, Paulo Vale, Jane Dunne, Federico Gaspari, Teresa Lynn, Helen McHugh, Andy Way, Victoria Arranz, Khalid Choukri, Hervé Pusset, Alexandre Sicard, Rui Neto, Maite Melero, David Perez, António Branco, Ruben Branco, Luís Gomes
We describe the European Language Resource Infrastructure (ELRI), a decentralised network to help collect, prepare and share language resources.
no code implementations • WS 2018 • Andoni Azpeitia, Thierry Etchegoyhen, Eva Martínez Garcia
To address the specifics of the corpus filtering task, which features significant volumes of noisy data, the core method was expanded with a penalty based on the number of unknown words in sentence pairs.
no code implementations • WS 2018 • Thierry Etchegoyhen, Eva Martínez Garcia, Andoni Azpeitia
We describe Vicomtech's participation in the WMT 2018 shared task on quality estimation, for which we submitted minimalist quality estimators.
no code implementations • WS 2017 • Andoni Azpeitia, Thierry Etchegoyhen, Eva Martínez Garcia
This article presents the STACCw system for the BUCC 2017 shared task on parallel sentence extraction from comparable corpora.
no code implementations • LREC 2016 • Thierry Etchegoyhen, Andoni Azpeitia, Naiara Pérez
The EITB corpus, a strongly comparable corpus in the news domain, is to be shared with the research community, as an aid for the development and testing of methods in comparable corpora exploitation, and as a basis for the improvement of data-driven machine translation systems for this language pair.
no code implementations • LREC 2014 • Isa Maks, Ruben Izquierdo, Francesca Frontini, Rodrigo Agerri, Piek Vossen, Andoni Azpeitia
In this paper we focus on the creation of general-purpose (as opposed to domain-specific) polarity lexicons in five languages (French, Italian, Dutch, English and Spanish) using WordNet propagation.