1 code implementation • LREC 2022 • Kenneth Church, Xingyu Cai, Yuchen Bian
We propose using lexical resources (thesaurus, VAD) to fine-tune pretrained deep nets such as BERT and ERNIE.
no code implementations • EMNLP 2020 • Jiaji Huang, Xingyu Cai, Kenneth Church
This paper designs a Monolingual Lexicon Induction task and observes that two factors accompany the degraded accuracy of bilingual lexicon induction for rare words.
no code implementations • RaPID (LREC) 2022 • Jiahong Yuan, Xingyu Cai, Kenneth Church
The result represents a relative error reduction of 14% over the baseline model trained without data augmentation.
no code implementations • 27 Feb 2024 • Rohit Prabhavalkar, Zhong Meng, Weiran Wang, Adam Stooke, Xingyu Cai, Yanzhang He, Arun Narayanan, Dongseong Hwang, Tara N. Sainath, Pedro J. Moreno
In the present work, we study one such strategy: applying multiple frame reduction layers in the encoder to compress encoder outputs into a small number of output frames.
Automatic Speech Recognition (ASR) +2
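The frame-reduction idea described above can be illustrated as simple frame stacking: concatenating groups of consecutive encoder frames so the time axis shrinks by the stacking factor. This is a minimal sketch of the general technique, not the paper's exact architecture; the function name and stack factor are illustrative assumptions.

```python
import numpy as np

def reduce_frames(frames, stack=4):
    """Concatenate every `stack` consecutive frames into one wider frame,
    shortening the time axis by a factor of `stack` (a common
    frame-reduction scheme; not necessarily the paper's exact layer)."""
    t, d = frames.shape
    pad = (-t) % stack                       # zero-pad so t divides evenly
    frames = np.pad(frames, ((0, pad), (0, 0)))
    return frames.reshape(-1, stack * d)     # shape: (ceil(t/stack), stack*d)

enc = np.random.randn(100, 256)              # 100 encoder frames, 256 dims
out = reduce_frames(enc, stack=4)
print(out.shape)                             # (25, 1024)
```

Emitting 4x fewer (but wider) frames cuts the decoder's sequence length, which is the latency/compute saving the abstract refers to.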
no code implementations • 22 Sep 2023 • Weiran Wang, Rohit Prabhavalkar, Dongseong Hwang, Qiujia Li, Khe Chai Sim, Bo Li, James Qin, Xingyu Cai, Adam Stooke, Zhong Meng, CJ Zheng, Yanzhang He, Tara Sainath, Pedro Moreno Mengibar
In this work, we investigate two popular end-to-end automatic speech recognition (ASR) models, namely Connectionist Temporal Classification (CTC) and RNN-Transducer (RNN-T), for offline recognition of voice search queries, with up to 2B model parameters.
Automatic Speech Recognition (ASR) +1
no code implementations • 27 Apr 2022 • Guangxu Xun, Mingbo Ma, Yuchen Bian, Xingyu Cai, Jiaji Huang, Renjie Zheng, Junkun Chen, Jiahong Yuan, Kenneth Church, Liang Huang
In simultaneous translation (SimulMT), the most widely used strategy is the wait-k policy thanks to its simplicity and effectiveness in balancing translation quality and latency.
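The wait-k read/write schedule mentioned above is simple enough to state in a few lines: the decoder first reads k source tokens, then alternates one read with one write until the source is exhausted. A minimal sketch (the function name is mine, not from the paper):

```python
def wait_k_schedule(k, src_len, tgt_len):
    """Number of source tokens the wait-k policy has read before
    emitting each target token t = 0..tgt_len-1: read k tokens up
    front, then read one more per emitted token, capped at src_len."""
    return [min(k + t, src_len) for t in range(tgt_len)]

# With k=3 and a 6-token source, the first target token is emitted
# after reading 3 source tokens, the fourth after reading all 6.
print(wait_k_schedule(k=3, src_len=6, tgt_len=5))  # [3, 4, 5, 6, 6]
```

The single parameter k directly trades latency (small k) against translation quality (large k), which is the balance the abstract credits the policy with.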
no code implementations • ICLR 2022 • Xingyu Cai, Jiahong Yuan, Yuchen Bian, Guangxu Xun, Jiaji Huang, Kenneth Church
Standard CTC computes a loss by aggregating over all possible alignment paths that map the entire input sequence to the entire label sequence (full alignment).
Automatic Speech Recognition (ASR) +2
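The full-alignment aggregation that standard CTC performs can be made concrete with the usual forward dynamic program over the blank-interleaved label sequence. This is a textbook sketch of standard CTC (assuming non-empty labels), not the paper's proposed variant:

```python
import math

def ctc_loss(log_probs, labels, blank=0):
    """Negative log-likelihood of `labels` under standard CTC: a forward
    DP that sums probability over ALL alignment paths mapping the full
    T-frame input to the full label sequence (the "full alignment").
    log_probs: list of T per-frame log-probability vectors."""
    ext = [blank]                      # interleave blanks: [b, l1, b, l2, ..., b]
    for l in labels:
        ext += [l, blank]
    S, T = len(ext), len(log_probs)
    NEG = float("-inf")
    alpha = [NEG] * S                  # alpha[s]: log-prob of prefixes ending at s
    alpha[0] = log_probs[0][blank]
    alpha[1] = log_probs[0][ext[1]]
    for t in range(1, T):
        new = [NEG] * S
        for s in range(S):
            cands = [alpha[s]]                     # stay on same symbol
            if s > 0:
                cands.append(alpha[s - 1])         # advance one position
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append(alpha[s - 2])         # skip a blank between distinct labels
            m = max(cands)
            if m > NEG:                            # log-sum-exp over incoming paths
                new[s] = m + math.log(sum(math.exp(c - m) for c in cands)) \
                           + log_probs[t][ext[s]]
        alpha = new
    m = max(alpha[-1], alpha[-2])      # paths may end on final label or final blank
    return -(m + math.log(math.exp(alpha[-1] - m) + math.exp(alpha[-2] - m)))

# Two frames, vocab {blank, a}, uniform p=0.5: paths emitting "a" are
# (a,a), (blank,a), (a,blank), total probability 0.75.
lp = [[math.log(0.5)] * 2] * 2
print(ctc_loss(lp, [1]))  # ≈ 0.2877 (= -log 0.75)
```

Summing over every path is what makes this the "full alignment" objective; restricting the aggregation to partial alignments is the contrast the abstract draws.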
no code implementations • 2 Aug 2021 • Jiahong Yuan, Xingyu Cai, Renjie Zheng, Liang Huang, Kenneth Church
Models of phonemes, broad phonetic classes, and syllables all significantly outperform the utterance model, demonstrating that phonetic units are helpful and should be incorporated in speech emotion recognition.
no code implementations • 2 Aug 2021 • Jiahong Yuan, Xingyu Cai, Dongji Gao, Renjie Zheng, Liang Huang, Kenneth Church
Much of the recent literature on automatic speech recognition (ASR) is taking an end-to-end approach.
Automatic Speech Recognition (ASR) +1
no code implementations • 2 Aug 2021 • Jiahong Yuan, Neville Ryant, Xingyu Cai, Kenneth Church, Mark Liberman
This study reports our efforts to improve automatic recognition of suprasegmentals by fine-tuning wav2vec 2.0 with CTC, a method that has been successful in automatic speech recognition.
Automatic Speech Recognition (ASR) +1
no code implementations • NAACL 2021 • Yuchen Bian, Jiaji Huang, Xingyu Cai, Jiahong Yuan, Kenneth Church
(What) We define and focus the study on redundancy matrices generated from the pre-trained and fine-tuned BERT-base models on the GLUE datasets.
no code implementations • 12 May 2021 • Boxiang Liu, Jiaji Huang, Xingyu Cai, Kenneth Church
This paper compares BERT-SQuAD and Ab3P on the Abbreviation Definition Identification (ADI) task.
no code implementations • ICLR 2021 • Xingyu Cai, Jiaji Huang, Yuchen Bian, Kenneth Church
We hope the study in this paper provides insights toward a better understanding of deep language models.
1 code implementation • NeurIPS 2019 • Xingyu Cai, Tingyang Xu, Jin-Feng Yi, Junzhou Huang, Sanguthevar Rajasekaran
Dynamic Time Warping (DTW) is widely used as a similarity measure in various domains.
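The DTW similarity measure referenced above is the classic dynamic program: the minimum cumulative pairwise cost over all monotonic, boundary-matching warping paths between two sequences. A minimal reference sketch of standard DTW (not the paper's accelerated variant):

```python
def dtw(x, y, dist=lambda a, b: abs(a - b)):
    """Dynamic Time Warping distance between sequences x and y:
    minimize total pairwise cost over all warping paths that start at
    (0, 0), end at (n-1, m-1), and move monotonically through both."""
    n, m = len(x), len(y)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # extend the cheapest of the three predecessor cells
            D[i][j] = dist(x[i - 1], y[j - 1]) + min(
                D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

# A repeated sample costs nothing under warping: 2 aligns to both 2s.
print(dtw([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

Unlike Euclidean distance, DTW tolerates local stretching and compression of the time axis, which is why it is widely used for comparing time series of differing speeds.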