no code implementations • 18 Dec 2023 • Guru Prakash Arumugam, Shuo-Yiin Chang, Tara N. Sainath, Rohit Prabhavalkar, Quan Wang, Shaan Bijwadia
ASR models often suffer from a long-form deletion problem, in which the model predicts sequential blanks instead of words when transcribing lengthy audio (on the order of minutes or hours).
no code implementations • 14 Aug 2023 • Shaan Bijwadia, Shuo-Yiin Chang, Weiran Wang, Zhong Meng, Hao Zhang, Tara N. Sainath
Text injection for automatic speech recognition (ASR), wherein unpaired text-only data is used to supplement paired audio-text data, has shown promising improvements for word error rate.
Automatic Speech Recognition (ASR)
no code implementations • 1 Nov 2022 • Shaan Bijwadia, Shuo-Yiin Chang, Bo Li, Tara Sainath, Chao Zhang, Yanzhang He
In this work, we propose a method to jointly train the ASR and endpointing (EP) tasks in a single end-to-end (E2E) multitask model, improving EP quality by optionally leveraging information from the ASR audio encoder.
Automatic Speech Recognition (ASR)