Search Results for author: Viet Anh Trinh

Found 10 papers, 1 paper with code

Two-pass Endpoint Detection for Speech Recognition

no code implementations • 17 Jan 2024 • Anirudh Raju, Aparna Khare, Di He, Ilya Sklyar, Long Chen, Sam Alptekin, Viet Anh Trinh, Zhe Zhang, Colin Vaz, Venkatesh Ravichandran, Roland Maas, Ariya Rastrow

Endpoint (EP) detection is a key component of far-field speech recognition systems that assist the user through voice commands.

Speech Recognition

Adaptive Endpointing with Deep Contextual Multi-armed Bandits

no code implementations • 23 Mar 2023 • Do June Min, Andreas Stolcke, Anirudh Raju, Colin Vaz, Di He, Venkatesh Ravichandran, Viet Anh Trinh

In this paper, we aim to provide a solution for adaptive endpointing by proposing an efficient method for choosing an optimal endpointing configuration given utterance-level audio features in an online setting, while avoiding hyperparameter grid-search.

Multi-Armed Bandits

Reducing Geographic Disparities in Automatic Speech Recognition via Elastic Weight Consolidation

no code implementations • 16 Jul 2022 • Viet Anh Trinh, Pegah Ghahremani, Brian King, Jasha Droppo, Andreas Stolcke, Roland Maas

A popular approach is to fine-tune the model with data from regions where the ASR model has a higher word error rate (WER).

Automatic Speech Recognition (ASR) +2

ImportantAug: a data augmentation agent for speech

1 code implementation • ICASSP 2022 • Viet Anh Trinh, Hassan Salami Kavaki, Michael I Mandel

We introduce ImportantAug, a technique to augment training data for speech classification and recognition models by adding noise to unimportant regions of the speech and not to important regions.

Ranked #1 on Keyword Spotting on Google Speech Commands (Google Speech Command-Musan metric)

Data Augmentation Keyword Spotting +1

Unsupervised Speech Enhancement with speech recognition embedding and disentanglement losses

no code implementations • 16 Nov 2021 • Viet Anh Trinh, Sebastian Braun

Our results show that the proposed function effectively improves the speech enhancement performance compared to a baseline trained in a supervised way on the noisy VoxCeleb dataset.

Disentanglement Speech Enhancement +2

Enhancement of Spatial Clustering-Based Time-Frequency Masks using LSTM Neural Networks

no code implementations • 2 Dec 2020 • Felix Grezes, Zhaoheng Ni, Viet Anh Trinh, Michael Mandel

By using LSTMs to enhance spatial clustering based time-frequency masks, we achieve both the signal modeling performance of multiple single-channel LSTM-DNN speech enhancers and the signal separation performance and generality of multi-channel spatial clustering.

Clustering Speech Enhancement

Improved MVDR Beamforming Using LSTM Speech Models to Clean Spatial Clustering Masks

no code implementations • 2 Dec 2020 • Zhaoheng Ni, Felix Grezes, Viet Anh Trinh, Michael I. Mandel

Spatial clustering techniques can achieve significant multi-channel noise reduction across relatively arbitrary microphone configurations, but have difficulty incorporating a detailed speech/noise model.

Clustering

Combining Spatial Clustering with LSTM Speech Models for Multichannel Speech Enhancement

no code implementations • 2 Dec 2020 • Felix Grezes, Zhaoheng Ni, Viet Anh Trinh, Michael Mandel

The system is compared to several baselines on the CHiME3 dataset in terms of speech quality predicted by the PESQ algorithm and word error rate of a recognizer trained on mismatched conditions, in order to focus on generalization.

Clustering Speech Enhancement

Large scale evaluation of importance maps in automatic speech recognition

no code implementations • 21 May 2020 • Viet Anh Trinh, Michael I Mandel

In this paper, we propose a metric that we call the structured saliency benchmark (SSBM) to evaluate importance maps computed for automatic speech recognizers on individual utterances.

Automatic Speech Recognition (ASR) +2
