no code implementations • RaPID (LREC) 2022 • Birger Moell, Jim O’Regan, Shivam Mehta, Ambika Kirkland, Harm Lameris, Joakim Gustafson, Jonas Beskow
As part of the PSST challenge, we explore how data augmentations, data sources, and model size affect phoneme transcription accuracy on speech produced by individuals with aphasia.
no code implementations • EACL (VarDial) 2021 • Harm Lameris, Sara Stymne
We find that training on a very small amount of Scots data was superior to zero-shot transfer from English.
no code implementations • 24 Nov 2022 • Harm Lameris, Shivam Mehta, Gustav Eje Henter, Joakim Gustafson, Éva Székely
Spontaneous speech has many affective and pragmatic functions that are interesting and challenging to model in TTS.
2 code implementations • 13 Nov 2022 • Shivam Mehta, Ambika Kirkland, Harm Lameris, Jonas Beskow, Éva Székely, Gustav Eje Henter
Neural HMMs are a type of neural transducer recently proposed for sequence-to-sequence modelling in text-to-speech.
Ranked #11 on Text-To-Speech Synthesis on LJSpeech (using extra training data)
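The core neural-HMM idea named above — replacing a classical HMM's fixed emission and transition tables with outputs of a neural network, while keeping the HMM's left-to-right probabilistic structure — can be illustrated with a minimal forward-algorithm sketch. This is an assumption-laden toy, not the paper's architecture: `tiny_net` is a hypothetical stand-in for the autoregressive network, and the Gaussian emission with unit variance is chosen only for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

def tiny_net(state_vec, prev_frame):
    # Hypothetical stand-in for the autoregressive neural net: maps
    # (state vector, previous acoustic frame) to an emission mean and a
    # "advance to next state" probability. Not the published model.
    h = np.tanh(state_vec + 0.1 * prev_frame)
    mean = h                                        # Gaussian mean for this frame
    p_advance = 1.0 / (1.0 + np.exp(-h.sum()))      # transition probability
    return mean, p_advance

def neural_hmm_log_likelihood(states, frames, var=1.0):
    """Forward algorithm for a left-to-right neural HMM (toy version).

    states: (N, D) encoder outputs, one vector per phone/state.
    frames: (T, D) acoustic frames.
    Each state may only stay or advance to the next state; emission and
    transition parameters come from the network rather than fixed tables,
    which is the defining feature of a neural HMM.
    """
    N, D = states.shape
    T = frames.shape[0]
    log_alpha = np.full(N, -np.inf)
    log_alpha[0] = 0.0                              # must start in the first state
    prev_frame = np.zeros(D)
    for t in range(T):
        means = np.empty((N, D))
        p_adv = np.empty(N)
        for i in range(N):
            means[i], p_adv[i] = tiny_net(states[i], prev_frame)
        # Gaussian log emission probability of frame t under each state
        log_emit = (-0.5 * np.sum((frames[t] - means) ** 2, axis=1) / var
                    - 0.5 * D * np.log(2 * np.pi * var))
        # stay in the same state
        new = log_alpha + np.log(1.0 - p_adv + 1e-12) + log_emit
        # advance from the previous state
        adv = log_alpha[:-1] + np.log(p_adv[:-1] + 1e-12) + log_emit[1:]
        new[1:] = np.logaddexp(new[1:], adv)
        log_alpha = new
        prev_frame = frames[t]
    return log_alpha[-1]                            # end in the final state

states = rng.standard_normal((3, 4))
frames = rng.standard_normal((6, 4))
ll = neural_hmm_log_likelihood(states, frames)
print(ll)
```

Because the emission densities and transitions are all valid (log-)probabilities, the returned value is a finite negative log-likelihood; training such a model maximises this quantity exactly, which is what distinguishes neural HMMs from attention-based TTS objectives.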