Search Results for author: Anton Ragni

Found 17 papers, 6 papers with code

Non-Intrusive Speech Intelligibility Prediction for Hearing-Impaired Users using Intermediate ASR Features and Human Memory Models

no code implementations • 24 Jan 2024 • Rhiannon Mogridge, George Close, Robert Sutherland, Thomas Hain, Jon Barker, Stefan Goetze, Anton Ragni

Neural networks have been successfully used for non-intrusive speech intelligibility prediction.

Decoder


How Much Context Does My Attention-Based ASR System Need?

1 code implementation • 24 Oct 2023 • Robert Flynn, Anton Ragni

For the task of speech recognition, the use of more than 30 seconds of acoustic context during training is uncommon, and under-investigated in literature.

Speech Recognition


Energy-Based Models For Speech Synthesis

no code implementations • 19 Oct 2023 • Wanli Sun, Zehai Tu, Anton Ragni

It also describes how sampling from EBMs can be performed using Langevin Markov Chain Monte-Carlo (MCMC).

Speech Synthesis


On the Effectiveness of Speech Self-supervised Learning for Music

no code implementations • 11 Jul 2023 • Yinghao Ma, Ruibin Yuan, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Ruibo Liu, Gus Xia, Roger Dannenberg, Yike Guo, Jie Fu

Our findings suggest that training with music data can generally improve performance on MIR tasks, even when models are trained using paradigms designed for speech.

Information Retrieval, Music Information Retrieval +2


Leveraging Cross-Utterance Context For ASR Decoding

no code implementations • 29 Jun 2023 • Robert Flynn, Anton Ragni

While external language models (LMs) are often incorporated into the decoding stage of automated speech recognition systems, these models usually operate with limited context.

Speech Recognition


MARBLE: Music Audio Representation Benchmark for Universal Evaluation

1 code implementation • NeurIPS 2023 • Ruibin Yuan, Yinghao Ma, Yizhi Li, Ge Zhang, Xingran Chen, Hanzhi Yin, Le Zhuo, Yiqi Liu, Jiawen Huang, Zeyue Tian, Binyue Deng, Ningzhi Wang, Chenghua Lin, Emmanouil Benetos, Anton Ragni, Norbert Gyenge, Roger Dannenberg, Wenhu Chen, Gus Xia, Wei Xue, Si Liu, Shi Wang, Ruibo Liu, Yike Guo, Jie Fu

This is evident in the limited work on deep music representations, the scarcity of large-scale datasets, and the absence of a universal and community-driven benchmark.

Image Generation, Information Retrieval +1


MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training

1 code implementation • 31 May 2023 • Yizhi Li, Ruibin Yuan, Ge Zhang, Yinghao Ma, Xingran Chen, Hanzhi Yin, Chenghao Xiao, Chenghua Lin, Anton Ragni, Emmanouil Benetos, Norbert Gyenge, Roger Dannenberg, Ruibo Liu, Wenhu Chen, Gus Xia, Yemin Shi, Wenhao Huang, Zili Wang, Yike Guo, Jie Fu

Although SSL has been proven effective in speech and audio, its application to music audio has yet to be thoroughly explored.

Language Modelling, Quantization +1


MAP-Music2Vec: A Simple and Effective Baseline for Self-Supervised Music Audio Representation Learning

no code implementations • 5 Dec 2022 • Yizhi Li, Ruibin Yuan, Ge Zhang, Yinghao Ma, Chenghua Lin, Xingran Chen, Anton Ragni, Hanzhi Yin, Zhijie Hu, Haoyu He, Emmanouil Benetos, Norbert Gyenge, Ruibo Liu, Jie Fu

The deep learning community has witnessed an exponentially growing interest in self-supervised learning (SSL).

Representation Learning, Self-Supervised Learning


HERB: Measuring Hierarchical Regional Bias in Pre-trained Language Models

1 code implementation • 5 Nov 2022 • Yizhi Li, Ge Zhang, Bohao Yang, Chenghua Lin, Shi Wang, Anton Ragni, Jie Fu

In addition to verifying the existence of regional bias in LMs, we find that the biases on regional groups can be strongly influenced by the geographical clustering of the groups.

Fairness


Approximate Fixed-Points in Recurrent Neural Networks

no code implementations • 4 Jun 2021 • Zhengxiong Wang, Anton Ragni

Although exact fixed-points inherit the same parallelization and inconsistency issues, this paper shows that approximate fixed-points can be computed in parallel and used consistently in training and inference including tasks such as lattice rescoring.


Continuous representations of intents for dialogue systems

no code implementations • 8 May 2021 • Sindre André Jacobsen, Anton Ragni

Finally, this paper will show how the proposed model can be augmented with unseen intents without retraining any of the seen ones.

Intent Detection, Zero-Shot Learning


Confidence Estimation for Black Box Automatic Speech Recognition Systems Using Lattice Recurrent Neural Networks

2 code implementations • 25 Oct 2019 • Alexandros Kastanos, Anton Ragni, Mark Gales

This paper examines this limited resource scenario for confidence estimation, a measure commonly used to assess transcription reliability.

Automatic Speech Recognition (ASR) +1


Confidence Estimation and Deletion Prediction Using Bidirectional Recurrent Neural Networks

no code implementations • 30 Oct 2018 • Anton Ragni, Qiujia Li, Mark Gales, Yu Wang

These errors are not accounted for by the standard confidence estimation schemes and are hard to rectify in the upstream and downstream processing.


Bi-Directional Lattice Recurrent Neural Networks for Confidence Estimation

4 code implementations • 30 Oct 2018 • Qiujia Li, Preben Ness, Anton Ragni, Mark Gales

The standard approach to mitigate errors made by an automatic speech recognition system is to use confidence scores associated with each predicted word.

Automatic Speech Recognition (ASR) +3


Phonetic and Graphemic Systems for Multi-Genre Broadcast Transcription

no code implementations • 1 Feb 2018 • Yu Wang, Xie Chen, Mark Gales, Anton Ragni, Jeremy Wong

As the combination approaches become more complicated the difference between the phonetic and graphemic systems further decreases.

Automatic Speech Recognition (ASR) +1


Future Word Contexts in Neural Network Language Models

no code implementations • 18 Aug 2017 • Xie Chen, Xunying Liu, Anton Ragni, Yu Wang, Mark Gales

Instead of using a recurrent unit to capture the complete future word contexts, a feedforward unit is used to model a finite number of succeeding, future, words.

Speech Recognition


Incorporating Uncertainty into Deep Learning for Spoken Language Assessment

no code implementations • ACL 2017 • Andrey Malinin, Anton Ragni, Kate Knill, Mark Gales

On experiments conducted on data from the Business Language Testing Service (BULATS), the proposed approach is found to outperform GPs and DNNs with MCD in uncertainty-based rejection whilst achieving comparable grading performance.

