Search Results for author: Antonio Bonafonte

Found 15 papers, 7 papers with code

Controllable Emphasis with zero data for text-to-speech

no code implementations • 13 Jul 2023 • Arnaud Joly, Marco Nicolis, Ekaterina Peterova, Alessandro Lombardi, Ammar Abbas, Arent van Korlaar, Aman Hussain, Parul Sharma, Alexis Moinet, Mateusz Lajszczak, Penny Karanasou, Antonio Bonafonte, Thomas Drugman, Elena Sokolova

We show that this technique significantly closes the gap to methods that require explicit recordings.

Sentence

Analysis and Utilization of Entrainment on Acoustic and Emotion Features in User-agent Dialogue

no code implementations • 7 Dec 2022 • Daxin Tan, Nikos Kargas, David McHardy, Constantinos Papayiannis, Antonio Bonafonte, Marek Strelec, Jonas Rohnke, Agis Oikonomou Filandras, Trevor Wood

Entrainment is the phenomenon by which an interlocutor adapts their speaking style to align with their partner in conversations.

Spoken Dialogue Systems

Distribution augmentation for low-resource expressive text-to-speech

no code implementations • 13 Feb 2022 • Mateusz Lajszczak, Animesh Prasad, Arent van Korlaar, Bajibabu Bollepalli, Antonio Bonafonte, Arnaud Joly, Marco Nicolis, Alexis Moinet, Thomas Drugman, Trevor Wood, Elena Sokolova

This paper presents a novel data augmentation technique for text-to-speech (TTS) that generates new (text, audio) training examples without requiring any additional data.

Data Augmentation

Discrete Acoustic Space for an Efficient Sampling in Neural Text-To-Speech

no code implementations • 24 Oct 2021 • Marek Strong, Jonas Rohnke, Antonio Bonafonte, Mateusz Łajszczak, Trevor Wood

We present a Split Vector Quantized Variational Autoencoder (SVQ-VAE) architecture using a split vector quantizer for NTTS, as an enhancement to the well-known Variational Autoencoder (VAE) and Vector Quantized Variational Autoencoder (VQ-VAE) architectures.

Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems

1 code implementation • NAACL 2021 • Shubhi Tyagi, Antonio Bonafonte, Jaime Lorenzo-Trueba, Javier Latorre

Developing Text Normalization (TN) systems for Text-to-Speech (TTS) on new languages is hard.

Prosodic Phrase Alignment for Machine Dubbing

1 code implementation • 20 Aug 2019 • Alp Öktem, Mireia Farrús, Antonio Bonafonte

Dubbing is a type of audiovisual translation where dialogues are translated and enacted so that they give the impression that the media is in the target language.

Machine Translation • Translation

Problem-Agnostic Speech Embeddings for Multi-Speaker Text-to-Speech with SampleRNN

2 code implementations • 3 Jun 2019 • David Álvarez, Santiago Pascual, Antonio Bonafonte

This way we feed the acoustic model with speaker acoustically dependent representations that enrich the waveform generation more than discrete embeddings unrelated to these factors.

Sound • Audio and Speech Processing

436

Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks

1 code implementation • 6 Apr 2019 • Santiago Pascual, Mirco Ravanelli, Joan Serrà, Antonio Bonafonte, Yoshua Bengio

Learning good representations without supervision is still an open issue in machine learning, and is particularly challenging for speech signals, which are often characterized by long sequences with a complex hierarchical structure.

Ranked #2 on Distant Speech Recognition on DIRHA English WSJ

Distant Speech Recognition

436

Towards Generalized Speech Enhancement with Generative Adversarial Networks

no code implementations • 6 Apr 2019 • Santiago Pascual, Joan Serrà, Antonio Bonafonte

The speech enhancement task usually consists of removing additive noise or reverberation that partially mask spoken utterances, affecting their intelligibility.

Generative Adversarial Network • Speech Enhancement

Self-Attention Linguistic-Acoustic Decoder

no code implementations • 31 Aug 2018 • Santiago Pascual, Antonio Bonafonte, Joan Serrà

The conversion from text to speech relies on the accurate mapping from linguistic to acoustic symbol sequences, for which current practice employs recurrent statistical models like recurrent neural networks.

Decoder • Speech Synthesis

Whispered-to-voiced Alaryngeal Speech Conversion with Generative Adversarial Networks

3 code implementations • 31 Aug 2018 • Santiago Pascual, Antonio Bonafonte, Joan Serrà, Jose A. Gonzalez

Most methods of voice restoration for patients suffering from aphonia either produce whispered or monotone speech.

Speech Enhancement

373

Language and Noise Transfer in Speech Enhancement Generative Adversarial Network

3 code implementations • 18 Dec 2017 • Santiago Pascual, Maruchan Park, Joan Serrà, Antonio Bonafonte, Kang-Hun Ahn

In this work, we present the results of adapting a speech enhancement generative adversarial network by finetuning the generator with small amounts of data.

Generative Adversarial Network • Speech Enhancement

373

SEGAN: Speech Enhancement Generative Adversarial Network

20 code implementations • 28 Mar 2017 • Santiago Pascual, Antonio Bonafonte, Joan Serrà

In contrast to current techniques, we operate at the waveform level, training the model end-to-end, and incorporate 28 speakers and 40 different noise conditions into the same model, such that model parameters are shared across them.

Generative Adversarial Network • Speech Enhancement

1,575

Building Synthetic Voices in the META-NET Framework

no code implementations • LREC 2012 • Emília Garcia Casademont, Antonio Bonafonte, Asunción Moreno

It is a project in the META-NET Network of Excellence, a cluster of projects aiming at fostering the mission of META, which is the Multilingual Europe Technology Alliance, dedicated to building the technological foundations of a multilingual European information society.

Speech Synthesis • Voice Conversion

BUCEADOR, a multi-language search engine for digital libraries

no code implementations • LREC 2012 • Jordi Adell, Antonio Bonafonte, Antonio Cardenal, Marta R. Costa-jussà, José A. R. Fonollosa, Asunción Moreno, Eva Navas, Eduardo R. Banga

The paper presents the tool's functionality, architecture, and digital library, and provides some information about the technology involved in the fields of automatic speech recognition, statistical machine translation, text-to-speech synthesis and information retrieval.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) (+7 more)
