no code implementations • 25 Mar 2022 • Marion Weller-Di Marco, Matthias Huck, Alexander Fraser
Key challenges of rich target-side morphology in data-driven machine translation include: (1) A large number of differently inflected word surface forms entails a larger vocabulary and thus data sparsity.
no code implementations • ACL 2020 • Marion Weller-Di Marco, Alexander Fraser
This paper studies strategies to model word formation in NMT using rich linguistic information, namely a word segmentation approach that goes beyond splitting into substrings by considering fusional morphology.
no code implementations • WS 2017 • Aleš Tamchyna, Marion Weller-Di Marco, Alexander Fraser
NMT systems have problems with large vocabulary sizes.
no code implementations • WS 2017 • Marion Weller-Di Marco
This paper presents a simple method for German compound splitting that combines a basic frequency-based approach with a form-to-lemma mapping to approximate morphological operations.
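The combination described above can be illustrated with a minimal sketch. It is not the paper's implementation: the toy frequency table `LEMMA_FREQ`, the hand-written `FORM_TO_LEMMA` mapping, the restriction to two-part splits, and the geometric-mean scoring (a common frequency-based heuristic) are all assumptions made for illustration.

```python
from math import prod

# Hypothetical lemma frequencies (toy values, not taken from any corpus).
LEMMA_FREQ = {
    "haus": 500, "tür": 300, "haustür": 20,
    "kind": 400, "garten": 350, "kindergarten": 50,
}

# Hypothetical form-to-lemma mapping: a modifier form as it appears inside
# a compound (e.g. with a linking element or umlaut) is mapped to its lemma.
FORM_TO_LEMMA = {"kinder": "kind", "häuser": "haus"}


def to_lemma(form):
    """Return the lemma for a form, or None if the form is unknown."""
    if form in LEMMA_FREQ:
        return form
    return FORM_TO_LEMMA.get(form)


def split_compound(word, min_len=3):
    """Pick the best analysis among the unsplit word and all two-part splits.

    Each candidate is scored by the geometric mean of its parts' lemma
    frequencies; the highest-scoring candidate wins.
    """
    word = word.lower()
    candidates = []

    # Candidate 1: leave the word unsplit, if it is a known lemma.
    whole = to_lemma(word)
    if whole is not None:
        candidates.append([whole])

    # Remaining candidates: every split point where both parts are known.
    for i in range(min_len, len(word) - min_len + 1):
        parts = [to_lemma(word[:i]), to_lemma(word[i:])]
        if None not in parts:
            candidates.append(parts)

    if not candidates:
        return [word]  # nothing known about this word; return it unchanged

    def score(parts):
        return prod(LEMMA_FREQ.get(p, 0) for p in parts) ** (1 / len(parts))

    return max(candidates, key=score)
```

With these toy values, `split_compound("Kindergarten")` prefers the split into `kind` and `garten` over the unsplit word, because the mapping lets the form `kinder` contribute the frequency of its lemma.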
no code implementations • EACL 2017 • Marion Weller-Di Marco, Alexander Fraser, Sabine Schulte im Walde
Many errors in phrase-based SMT can be attributed to problems on three linguistic levels: morphological complexity in the target language, structural differences and lexical choice.