Search Results for author: Haris Jabbar

Found 5 papers, 3 papers with code

An Information-Theoretic Approach and Dataset for Probing Gender Stereotypes in Multilingual Masked Language Models

1 code implementation • Findings (NAACL) 2022 • Victor Steinborn, Philipp Dufter, Haris Jabbar, Hinrich Schuetze

Bias research in NLP is a rapidly growing and developing field.

WordScape: a Pipeline to extract multilingual, visually rich Documents with Layout Annotations from Web Crawl Data

1 code implementation • NeurIPS 2023 • Maurice Weber, Carlo Siebenschuh, Rory Butler, Anton Alexandrov, Valdemar Thanner, Georgios Tsolakis, Haris Jabbar, Ian Foster, Bo Li, Rick Stevens, Ce Zhang

Together with the pipeline, we will additionally release 9.5M URLs to Word documents, which can be processed using WordScape to create a dataset of over 40M pages.

document understanding Question Answering +1

MorphPiece : A Linguistic Tokenizer for Large Language Models

no code implementations • 14 Jul 2023 • Haris Jabbar

Tokenization is a critical part of modern NLP pipelines.

Language Modelling

Flow-Adapter Architecture for Unsupervised Machine Translation

no code implementations • ACL 2022 • Yihong Liu, Haris Jabbar, Hinrich Schütze

The primary novelties of our model are: (a) capturing language-specific sentence representations separately for each language using normalizing flows and (b) using a simple transformation of these latent representations for translating from one language to another.

NMT Sentence +2

Listening to Affected Communities to Define Extreme Speech: Dataset and Experiments

1 code implementation • Findings (ACL) 2022 • Antonis Maronikolakis, Axel Wisiorek, Leah Nann, Haris Jabbar, Sahana Udupa, Hinrich Schuetze

Building on current work on multilingual hate speech (e.g., Ousidhoum et al. (2019)) and hate speech reduction (e.g., Sap et al. (2020)), we present XTREMESPEECH, a new hate speech dataset containing 20,297 social media passages from Brazil, Germany, India and Kenya.
