Search Results for author: Marcos Garcia

Found 20 papers, 6 papers with code

Embeddings in Natural Language Processing: Theory and Advances in Vector Representations of Meaning

no code implementations • CL (ACL) 2021 • Marcos Garcia

A computational psycholinguistic evaluation of the syntactic abilities of Galician BERT models at the interface of dependency resolution and training time

1 code implementation • 6 Jun 2022 • Iria de-Dios-Flores, Marcos Garcia

This paper explores the ability of Transformer models to capture subject-verb and noun-adjective agreement dependencies in Galician.

SemEval-2022 Task 2: Multilingual Idiomaticity Detection and Sentence Embedding

1 code implementation • SemEval (NAACL) 2022 • Harish Tayyar Madabushi, Edward Gow-Smith, Marcos Garcia, Carolina Scarton, Marco Idiart, Aline Villavicencio

This paper presents the shared task on Multilingual Idiomaticity Detection and Sentence Embedding, which consists of two subtasks: (a) a binary classification task aimed at identifying whether a sentence contains an idiomatic expression, and (b) a task based on semantic text similarity which requires the model to adequately represent potentially idiomatic expressions in context.

Binary Classification, Sentence

Assessing the Representations of Idiomaticity in Vector Models with a Noun Compound Dataset Labeled at Type and Token Levels

1 code implementation • ACL 2021 • Marcos Garcia, Tiago Kramer Vieira, Carolina Scarton, Marco Idiart, Aline Villavicencio

This paper presents the Noun Compound Type and Token Idiomaticity (NCTTI) dataset, with human annotations for 280 noun compounds in English and 180 in Portuguese at both type and token level.

Vocal Bursts Type Prediction

Exploring the Representation of Word Meanings in Context: A Case Study on Homonymy and Synonymy

1 code implementation • ACL 2021 • Marcos Garcia

We assess the ability of both static and contextualized models to adequately represent different lexical-semantic relations, such as homonymy and synonymy.

Probing for idiomaticity in vector space models

1 code implementation • EACL 2021 • Marcos Garcia, Tiago Kramer Vieira, Carolina Scarton, Marco Idiart, Aline Villavicencio

Contextualised word representation models have been successfully used for capturing different word usages and they may be an attractive alternative for representing idiomaticity in language.

Bertinho: Galician BERT Representations

no code implementations • 25 Mar 2021 • David Vilares, Marcos Garcia, Carlos Gómez-Rodríguez

The experiments show that our models, especially the 12-layer one, outperform the results of mBERT in most tasks.

Dependency Parsing, Named Entity Recognition

A Method to Automatically Identify Diachronic Variation in Collocations

no code implementations • WS 2019 • Marcos Garcia, Marcos García Salido

This paper introduces a novel method to track collocational variations in diachronic corpora that can identify several changes undergone by these phraseological combinations and propose alternative solutions found in later periods.

Unsupervised Compositional Translation of Multiword Expressions

no code implementations • WS 2019 • Pablo Gamallo, Marcos Garcia

This article describes a dependency-based strategy that uses compositional distributional semantics and cross-lingual word embeddings to translate multiword expressions (MWEs).

Cross-Lingual Word Embeddings, Translation

A comparison of statistical association measures for identifying dependency-based collocations in various languages

no code implementations • WS 2019 • Marcos Garcia, Marcos García Salido, Margarita Alonso-Ramos

This paper presents an exploration of different statistical association measures to automatically identify collocations from corpora in English, Portuguese, and Spanish.

Pay Attention when you Pay the Bills. A Multilingual Corpus with Dependency-based and Semantic Annotation of Collocations

no code implementations • ACL 2019 • Marcos Garcia, Marcos García Salido, Susana Sotelo, Estela Mosqueira, Margarita Alonso-Ramos

This paper presents a new multilingual corpus with semantic annotation of collocations in English, Portuguese, and Spanish.

Natural Language Understanding, Text Generation

Towards Syntactic Iberian Polarity Classification

1 code implementation • WS 2017 • David Vilares, Marcos Garcia, Miguel A. Alonso, Carlos Gómez-Rodríguez

Lexicon-based methods using syntactic rules for polarity classification rely on parsers that are dependent on the language and on treebank guidelines.

Classification, General Classification

A rule-based system for cross-lingual parsing of Romance languages with Universal Dependencies

no code implementations • CONLL 2017 • Marcos Garcia, Pablo Gamallo

We also compare our system with a delexicalized parser for Romance languages, and take advantage of the harmonized annotation of Universal Dependencies to propose a language ranking based on the syntactic distance each variety has from Romance languages.

Dependency Parsing, POS

Using bilingual word-embeddings for multilingual collocation extraction

no code implementations • WS 2017 • Marcos Garcia, Marcos García-Salido, Margarita Alonso-Ramos

This paper presents a new strategy for multilingual collocation extraction which takes advantage of parallel corpora to learn bilingual word-embeddings.

Machine Translation, Translation

A Web Interface for Diachronic Semantic Search in Spanish

no code implementations • EACL 2017 • Pablo Gamallo, Iván Rodríguez-Torres, Marcos Garcia

This article describes a semantic system which is based on distributional models obtained from a chronologically structured language resource, namely Google Books Syntactic Ngrams. The models were created using dependency-based contexts and a strategy for reducing the vector space, which consists in selecting the more informative and relevant word contexts.

Incorporating Lexico-semantic Heuristics into Coreference Resolution Sieves for Named Entity Recognition at Document-level

no code implementations • LREC 2016 • Marcos Garcia

This paper explores the incorporation of lexico-semantic heuristics into a deterministic Coreference Resolution (CR) system for classifying named entities at document-level.

Coreference Resolution, Named Entity Recognition