no code implementations • RANLP 2019 • Pierre André Ménard, Antoine Mougeot
While high-quality gold-standard annotated corpora are crucial for most tasks in natural language processing, many annotated corpora published in recent years, whether created by annotators or by tools, contain noisy annotations.
1 code implementation • RANLP 2019 • Desislava Aleksandrova, François Lareau, Pierre André Ménard
We propose a multilingual method for the extraction of biased sentences from Wikipedia, and use it to create corpora in Bulgarian, French and English.
no code implementations • WS 2016 • Caroline Barrière, Pierre André Ménard, Daphnée Azoulay
We devise an experiment using over 1300 English terms found in scientific articles, and show that our domain-driven TSD algorithm is able to bring the best term record, and therefore the best French equivalent, to an average rank of 1.69, compared to a baseline random rank of 3.51.
no code implementations • LREC 2014 • Pierre André Ménard, Caroline Barrière
This research provides a comparison of a linked open data resource (DBpedia) and web corpus data resources (Google Web Ngrams and Google Books Ngrams) for noun compound bracketing.