no code implementations • 2 Feb 2024 • María Sánchez, Laura Fernández, Julián Arias, Mateo Cámara, Giulia Comini, Adam Gabrys, José Luis Blanco, Juan Ignacio Godino, Luis Alfonso Hernández
We present a processing flow that, starting from images extracted from videos, is able to sound them.
no code implementations • 31 Jul 2023 • Giulia Comini, Manuel Sam Ribeiro, Fan Yang, Heereen Shim, Jaime Lorenzo-Trueba
Phonetic information and linguistic knowledge are an essential component of a Text-to-speech (TTS) front-end.
no code implementations • 31 Jul 2023 • Manuel Sam Ribeiro, Giulia Comini, Jaime Lorenzo-Trueba
The G2P model is used to train a multilingual phone recognition system, which then decodes speech recordings with a phonetic representation.
no code implementations • 29 Jul 2022 • Giulia Comini, Goeric Huybrechts, Manuel Sam Ribeiro, Adam Gabrys, Jaime Lorenzo-Trueba
The availability of data in expressive styles across languages is limited, and recording sessions are costly and time consuming.
no code implementations • 16 Feb 2022 • Adam Gabryś, Goeric Huybrechts, Manuel Sam Ribeiro, Chung-Ming Chien, Julian Roth, Giulia Comini, Roberto Barra-Chicote, Bartek Perz, Jaime Lorenzo-Trueba
It uses voice conversion (VC) as a post-processing module appended to a pre-existing high-quality TTS system and marks a conceptual shift in the existing TTS paradigm, framing the few-shot TTS problem as a VC task.
no code implementations • 10 Feb 2022 • Manuel Sam Ribeiro, Julian Roth, Giulia Comini, Goeric Huybrechts, Adam Gabrys, Jaime Lorenzo-Trueba
The proposed approach relies on voice conversion to first generate high-quality data from the set of supporting expressive speakers.
no code implementations • 11 Nov 2020 • Goeric Huybrechts, Thomas Merritt, Giulia Comini, Bartek Perz, Raahil Shah, Jaime Lorenzo-Trueba
While recent neural text-to-speech (TTS) systems perform remarkably well, they typically require a substantial amount of recordings from the target speaker reading in the desired speaking style.