1 code implementation • LREC 2022 • Kenneth Church, Xingyu Cai, Yuchen Bian
We propose using lexical resources (thesaurus, VAD) to fine-tune pretrained deep nets such as BERT and ERNIE.
no code implementations • EMNLP 2020 • Jiaji Huang, Xingyu Cai, Kenneth Church
This paper designs a Monolingual Lexicon Induction task and observes that two factors accompany the degraded accuracy of bilingual lexicon induction for rare words.
no code implementations • RaPID (LREC) 2022 • Jiahong Yuan, Xingyu Cai, Kenneth Church
The result represents a relative error reduction of 14% over the baseline model trained without data augmentation.
no code implementations • 27 Feb 2024 • Rohit Prabhavalkar, Zhong Meng, Weiran Wang, Adam Stooke, Xingyu Cai, Yanzhang He, Arun Narayanan, Dongseong Hwang, Tara N. Sainath, Pedro J. Moreno
In the present work, we study one such strategy: applying multiple frame reduction layers in the encoder to compress encoder outputs into a small number of output frames.
Automatic Speech Recognition (ASR) +2
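The frame-reduction idea described above can be illustrated as simple frame stacking: concatenating groups of consecutive encoder frames so the time axis shrinks by the stacking factor. This is a minimal sketch of the general technique, not the paper's exact architecture; the function name and stack factor are illustrative assumptions.

```python
import numpy as np

def reduce_frames(frames, stack=4):
    """Concatenate every `stack` consecutive frames into one wider frame,
    shortening the time axis by a factor of `stack` (a common
    frame-reduction scheme; not necessarily the paper's exact layer)."""
    t, d = frames.shape
    pad = (-t) % stack                       # zero-pad so t divides evenly
    frames = np.pad(frames, ((0, pad), (0, 0)))
    return frames.reshape(-1, stack * d)     # shape: (ceil(t/stack), stack*d)

enc = np.random.randn(100, 256)              # 100 encoder frames, 256 dims
out = reduce_frames(enc, stack=4)
print(out.shape)                             # (25, 1024)
```

Emitting 4x fewer (but wider) frames cuts the decoder's sequence length, which is the latency/compute saving the abstract refers to.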
no code implementations • 22 Sep 2023 • Weiran Wang, Rohit Prabhavalkar, Dongseong Hwang, Qiujia Li, Khe Chai Sim, Bo Li, James Qin, Xingyu Cai, Adam Stooke, Zhong Meng, CJ Zheng, Yanzhang He, Tara Sainath, Pedro Moreno Mengibar
In this work, we investigate two popular end-to-end automatic speech recognition (ASR) models, namely Connectionist Temporal Classification (CTC) and RNN-Transducer (RNN-T), for offline recognition of voice search queries, with up to 2B model parameters.
Automatic Speech Recognition (ASR) +1
no code implementations • 27 Apr 2022 • Guangxu Xun, Mingbo Ma, Yuchen Bian, Xingyu Cai, Jiaji Huang, Renjie Zheng, Junkun Chen, Jiahong Yuan, Kenneth Church, Liang Huang
In simultaneous translation (SimulMT), the most widely used strategy is the wait-k policy thanks to its simplicity and effectiveness in balancing translation quality and latency.
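The wait-k read/write schedule mentioned above is simple enough to state in a few lines: the decoder first reads k source tokens, then alternates one read with one write until the source is exhausted. A minimal sketch (the function name is mine, not from the paper):

```python
def wait_k_schedule(k, src_len, tgt_len):
    """Number of source tokens the wait-k policy has read before
    emitting each target token t = 0..tgt_len-1: read k tokens up
    front, then read one more per emitted token, capped at src_len."""
    return [min(k + t, src_len) for t in range(tgt_len)]

# With k=3 and a 6-token source, the first target token is emitted
# after reading 3 source tokens, the fourth after reading all 6.
print(wait_k_schedule(k=3, src_len=6, tgt_len=5))  # [3, 4, 5, 6, 6]
```

The single parameter k directly trades latency (small k) against translation quality (large k), which is the balance the abstract credits the policy with.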
no code implementations • ICLR 2022 • Xingyu Cai, Jiahong Yuan, Yuchen Bian, Guangxu Xun, Jiaji Huang, Kenneth Church
Standard CTC computes a loss by aggregating over all possible alignment paths that map the entire input sequence to the entire label sequence (full alignment).
Automatic Speech Recognition (ASR) +2
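The full-alignment aggregation that standard CTC performs can be made concrete with the usual forward dynamic program over the blank-interleaved label sequence. This is a textbook sketch of standard CTC (assuming non-empty labels), not the paper's proposed variant:

```python
import math

def ctc_loss(log_probs, labels, blank=0):
    """Negative log-likelihood of `labels` under standard CTC: a forward
    DP that sums probability over ALL alignment paths mapping the full
    T-frame input to the full label sequence (the "full alignment").
    log_probs: list of T per-frame log-probability vectors."""
    ext = [blank]                      # interleave blanks: [b, l1, b, l2, ..., b]
    for l in labels:
        ext += [l, blank]
    S, T = len(ext), len(log_probs)
    NEG = float("-inf")
    alpha = [NEG] * S                  # alpha[s]: log-prob of prefixes ending at s
    alpha[0] = log_probs[0][blank]
    alpha[1] = log_probs[0][ext[1]]
    for t in range(1, T):
        new = [NEG] * S
        for s in range(S):
            cands = [alpha[s]]                     # stay on same symbol
            if s > 0:
                cands.append(alpha[s - 1])         # advance one position
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append(alpha[s - 2])         # skip a blank between distinct labels
            m = max(cands)
            if m > NEG:                            # log-sum-exp over incoming paths
                new[s] = m + math.log(sum(math.exp(c - m) for c in cands)) \
                           + log_probs[t][ext[s]]
        alpha = new
    m = max(alpha[-1], alpha[-2])      # paths may end on final label or final blank
    return -(m + math.log(math.exp(alpha[-1] - m) + math.exp(alpha[-2] - m)))

# Two frames, vocab {blank, a}, uniform p=0.5: paths emitting "a" are
# (a,a), (blank,a), (a,blank), total probability 0.75.
lp = [[math.log(0.5)] * 2] * 2
print(ctc_loss(lp, [1]))  # ≈ 0.2877 (= -log 0.75)
```

Summing over every path is what makes this the "full alignment" objective; restricting the aggregation to partial alignments is the contrast the abstract draws.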
no code implementations • 2 Aug 2021 • Jiahong Yuan, Xingyu Cai, Renjie Zheng, Liang Huang, Kenneth Church
Models of phonemes, broad phonetic classes, and syllables all significantly outperform the utterance model, demonstrating that phonetic units are helpful and should be incorporated in speech emotion recognition.
no code implementations • 2 Aug 2021 • Jiahong Yuan, Xingyu Cai, Dongji Gao, Renjie Zheng, Liang Huang, Kenneth Church
Much of the recent literature on automatic speech recognition (ASR) is taking an end-to-end approach.
Automatic Speech Recognition (ASR) +1
no code implementations • 2 Aug 2021 • Jiahong Yuan, Neville Ryant, Xingyu Cai, Kenneth Church, Mark Liberman
This study reports our efforts to improve automatic recognition of suprasegmentals by fine-tuning wav2vec 2.0 with CTC, a method that has been successful in automatic speech recognition.
Automatic Speech Recognition (ASR) +1
no code implementations • NAACL 2021 • Yuchen Bian, Jiaji Huang, Xingyu Cai, Jiahong Yuan, Kenneth Church
(What) We define and focus the study on redundancy matrices generated from the pre-trained and fine-tuned BERT-base models on the GLUE datasets.
no code implementations • 12 May 2021 • Boxiang Liu, Jiaji Huang, Xingyu Cai, Kenneth Church
This paper compares BERT-SQuAD and Ab3P on the Abbreviation Definition Identification (ADI) task.
no code implementations • ICLR 2021 • Xingyu Cai, Jiaji Huang, Yuchen Bian, Kenneth Church
We hope the study in this paper provides insights toward a better understanding of deep language models.
1 code implementation • NeurIPS 2019 • Xingyu Cai, Tingyang Xu, Jin-Feng Yi, Junzhou Huang, Sanguthevar Rajasekaran
Dynamic Time Warping (DTW) is widely used as a similarity measure in various domains.
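The DTW similarity measure referenced above is the classic dynamic program: the minimum cumulative pairwise cost over all monotonic, boundary-matching warping paths between two sequences. A minimal reference sketch of standard DTW (not the paper's accelerated variant):

```python
def dtw(x, y, dist=lambda a, b: abs(a - b)):
    """Dynamic Time Warping distance between sequences x and y:
    minimize total pairwise cost over all warping paths that start at
    (0, 0), end at (n-1, m-1), and move monotonically through both."""
    n, m = len(x), len(y)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # extend the cheapest of the three predecessor cells
            D[i][j] = dist(x[i - 1], y[j - 1]) + min(
                D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

# A repeated sample costs nothing under warping: 2 aligns to both 2s.
print(dtw([1, 2, 3], [1, 2, 2, 3]))  # 0.0
```

Unlike Euclidean distance, DTW tolerates local stretching and compression of the time axis, which is why it is widely used for comparing time series of differing speeds.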