no code implementations • LREC 2022 • Amir Hazem, Merieme Bouhandi, Florian Boudin, Beatrice Daille
Automatic Term Extraction (ATE) is a key component for domain knowledge understanding and an important basis for further natural language processing applications.
1 code implementation • 26 Feb 2024 • Adrien Bazoge, Emmanuel Morin, Beatrice Daille, Pierre-Antoine Gourraud
Recently, pretrained language models based on BERT have been introduced for the French biomedical domain.
no code implementations • 22 Feb 2024 • Yanis Labrak, Adrien Bazoge, Beatrice Daille, Mickael Rouvier, Richard Dufour
Subword tokenization has become the prevailing standard in the field of natural language processing (NLP) over recent years, primarily due to the widespread utilization of pre-trained language models.
1 code implementation • 20 Feb 2024 • Yanis Labrak, Adrien Bazoge, Oumaima El Khettari, Mickael Rouvier, Pacome Constant dit Beaufils, Natalia Grabar, Beatrice Daille, Solen Quiniou, Emmanuel Morin, Pierre-Antoine Gourraud, Richard Dufour
This limitation hampers the evaluation of the latest French biomedical models, as they are either assessed on a minimal number of tasks with non-standardized protocols or evaluated using general downstream tasks.
1 code implementation • 22 Nov 2022 • Mael Houbre, Florian Boudin, Beatrice Daille
Keyphrase generation is the task of generating a set of words or phrases that highlight the main topics of a document.
1 code implementation • COLING 2020 • Amir Hazem, Beatrice Daille, Dominique Stutzmann, Christopher Kermorvant, Louis Chevalier
In this paper, we address the segmentation of books of hours, Latin devotional manuscripts of the late Middle Ages, that exhibit challenging issues: a complex hierarchical entangled structure, variable content, noisy transcriptions with no sentence markers, and strong correlations between sections for which topical information is no longer sufficient to draw segmentation boundaries.
no code implementations • LREC 2020 • Amir Hazem, Mérieme Bouhandi, Florian Boudin, Beatrice Daille
Automatic terminology extraction is a notoriously difficult task that aims to reduce the manual effort required to identify terms in domain-specific corpora by automatically providing a ranked list of candidate terms.
no code implementations • LREC 2020 • Amir Hazem, Beatrice Daille, Claudia Lanza
Thesaurus construction with minimal human effort often relies on automatic methods to discover terms and their relations.
no code implementations • LREC 2020 • Yizhe Wang, Beatrice Daille, Nabil Hathout
The semantic projection method is often used in terminology structuring to infer semantic relations between terms.
no code implementations • JEP-TALN-RECITAL 2016 • Adrien Bougouin, Florian Boudin, Beatrice Daille
In this article, we focus on indexing documents from specialized domains by means of their keyphrases. More specifically, we focus on indexing as it is performed by professional indexers in digital libraries.