no code implementations • NAACL (BEA) 2022 • Rricha Jalota, Peter Bourgonje, Jan Van Sas, Huiyan Huang
The role of an author’s L1 in SLA can be challenging for automated CEFR classification, in that texts from different L1 groups may be too heterogeneous to combine as training data.
no code implementations • 27 Mar 2024 • Rricha Jalota, Lyan Verwimp, Markus Nussbaum-Thom, Amr Mousa, Arturo Argueta, Youssef Oualil
Based on this insight and leveraging the design of our production models, we introduce a new architecture for World English NNLM that meets the accuracy, latency, and memory constraints of our single-dialect models.
1 code implementation • 17 Jan 2024 • David Thulke, Yingbo Gao, Petrus Pelser, Rein Brune, Rricha Jalota, Floris Fok, Michael Ramos, Ian van Wyk, Abdallah Nasir, Hayden Goldstein, Taylor Tragemann, Katie Nguyen, Ariana Fowler, Andrew Stanco, Jon Gabriel, Jordan Taylor, Dean Moro, Evgenii Tsymbalov, Juliette de Waal, Evgeny Matusov, Mudar Yaghi, Mohammad Shihadah, Hermann Ney, Christian Dugast, Jonathan Dotan, Daniel Erasmus
To increase the accessibility of our model to non-English speakers, we propose to make use of cascaded machine translation and show that this approach can perform comparably to natively multilingual models while being easier to scale to a large number of languages.
no code implementations • 28 Oct 2023 • Rricha Jalota, Koel Dutta Chowdhury, Cristina España-Bonet, Josef van Genabith
We show how we can eliminate the need for parallel validation data by combining the self-supervised loss with an unsupervised loss.
1 code implementation • NAACL 2022 • Koel Dutta Chowdhury, Rricha Jalota, Cristina España-Bonet, Josef van Genabith
Cross-lingual natural language processing relies on translation, either by humans or machines, at different levels, from translating training data to translating test sets.
1 code implementation • 11 Mar 2021 • Daniel Vollmers, Rricha Jalota, Diego Moussallem, Hardik Topiwala, Axel-Cyrille Ngonga Ngomo, Ricardo Usbeck
In this paper, we present a novel QA approach, dubbed TeBaQA.
1 code implementation • 4 Aug 2020 • Hamada M. Zahera, Rricha Jalota, Mohamed A. Sherif, Axel N. Ngomo
In this paper, we propose I-AID, a multimodel approach to automatically categorize tweets into multi-label information types and filter critical information from the enormous volume of social media data.