no code implementations • LREC 2020 • António Branco, Amália Mendes, Paulo Quaresma, Luís Gomes, João Silva, Andrea Teixeira
This paper presents the PORTULAN CLARIN Research Infrastructure for the Science and Technology of Language, which is part of the European research infrastructure CLARIN ERIC as its Portuguese national node, and belongs to the Portuguese National Roadmap of Research Infrastructures of Strategic Relevance.
no code implementations • LREC 2020 • António Branco, Nicoletta Calzolari, Piek Vossen, Gertjan van Noord, Dieter van Uytvanck, João Silva, Luís Gomes, André Moreira, Willem Elbers
In this paper, we introduce a new type of shared task, one that is collaborative rather than competitive, designed to support and foster the reproduction of research results.
no code implementations • LREC 2020 • João António Rodrigues, Ruben Branco, João Silva, António Branco
Given a recent publication that pointed out spurious statistical cues in the data set used in the shared task, and that produced a revised version of it, we also evaluated the reproduced systems with this new data set.
no code implementations • LREC 2020 • António Branco, Sara Grilo, Márcia Bolrinha, Chakaveh Saedi, Ruben Branco, João Silva, Andreia Querido, Rita de Carvalho, Rosa Gaudio, Mariana Avelãs, Clara Pinto
The objective of the present paper is twofold: to present the MWN.PT WordNet and to report on its construction and on the lessons learned from it.
no code implementations • LREC 2020 • Sara Grilo, Márcia Bolrinha, João Silva, Rui Vaz, António Branco
This paper presents the BDCamões Collection of Portuguese Literary Documents, a new corpus of literary texts written in Portuguese that in its inaugural version includes close to 4 million words from over 200 complete documents from 83 authors in 14 genres, covering a time span from the 16th to the 21st century, and adhering to different orthographic conventions.
1 code implementation • WS 2018 • Chakaveh Saedi, António Branco, João António Rodrigues, João Silva
Semantic networks and semantic spaces have been two prominent approaches to represent lexical semantics.
1 code implementation • WS 2018 • João António Rodrigues, Ruben Branco, João Silva, Chakaveh Saedi, António Branco
The task of taking a semantic representation of a noun and predicting the brain activity triggered by it in terms of fMRI spatial patterns was pioneered by Mitchell et al. (2008).
no code implementations • SEMEVAL 2017 • João António Rodrigues, Chakaveh Saedi, Vladislav Maraev, João Silva, António Branco
This paper presents the results of systematic experimentation on how different types of questions affect duplicate question detection, across both a number of established approaches and a novel, superior one used to address this language processing task.
no code implementations • WS 2016 • Rosa Gaudio, Gorka Labaka, Eneko Agirre, Petya Osenova, Kiril Simov, Martin Popel, Dieke Oele, Gertjan van Noord, Luís Gomes, João António Rodrigues, Steven Neale, João Silva, Andreia Querido, Nuno Rendeiro, António Branco
no code implementations • LREC 2016 • Arantxa Otegi, Nora Aranberri, Antonio Branco, Jan Hajič, Martin Popel, Kiril Simov, Eneko Agirre, Petya Osenova, Rita Pereira, João Silva, Steven Neale
This work presents parallel corpora automatically annotated with several NLP tools, including lemma and part-of-speech tagging, named-entity recognition and classification, named-entity disambiguation, word-sense disambiguation, and coreference.
no code implementations • LREC 2016 • Rita de Carvalho, Andreia Querido, Marisa Campos, Rita Valadas Pereira, João Silva, António Branco
This paper presents a new linguistic resource for the study and computational processing of Portuguese.
no code implementations • LREC 2012 • João Silva, Luísa Coheur, Ângela Costa, Isabel Trancoso
In Statistical Machine Translation, words that were not seen during training are unknown words, that is, words that the system will not know how to translate.
no code implementations • LREC 2012 • António Branco, Catarina Carvalheiro, Sílvia Pereira, Sara Silveira, João Silva, Sérgio Castro, João Graça
With the CINTIL-International Corpus of Portuguese, an ongoing corpus annotated with a fully fledged grammatical representation, sentences get not only a high level of lexical, morphological and syntactic annotation but also a semantic analysis that prepares the data for a manual specification step and thus opens the way for a number of tools and resources that are currently a major research focus.