no code implementations • NAACL 2022 • Zheng Fang, Ruiqing Zhang, Zhongjun He, Hua Wu, Yanan Cao
Automatic Speech Recognition (ASR) is an efficient and widely used input method that transcribes speech signals into text.
Automatic Speech Recognition (ASR) +2
no code implementations • EMNLP 2020 • Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Haifeng Wang
The policy learns to segment the source text by considering possible translations produced by the translation model, maintaining consistency between the segmentation and translation.
no code implementations • ACL 2022 • Ruiqing Zhang, Zhongjun He, Hua Wu, Haifeng Wang
End-to-end simultaneous speech-to-text translation aims to directly perform translation from streaming source speech to target text with high translation quality and low latency.
no code implementations • NAACL (AutoSimTrans) 2021 • Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Haifeng Wang
This paper presents the results of the shared task of the 2nd Workshop on Automatic Simultaneous Translation (AutoSimTrans).
no code implementations • NAACL (AutoSimTrans) 2022 • Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Haifeng Wang, Liang Huang, Qun Liu, Julia Ive, Wolfgang Macherey
This paper reports the results of the shared task we hosted on the Third Workshop of Automatic Simultaneous Translation (AutoSimTrans).
1 code implementation • 11 Jan 2024 • Pengzhi Gao, Zhongjun He, Hua Wu, Haifeng Wang
The training paradigm for machine translation has gradually shifted, from learning neural machine translation (NMT) models with extensive parallel corpora to instruction finetuning on multilingual large language models (LLMs) with high-quality translation pairs.
1 code implementation • 28 Aug 2023 • Pengzhi Gao, Ruiqing Zhang, Zhongjun He, Hua Wu, Haifeng Wang
Consistency regularization methods, such as R-Drop (Liang et al., 2021) and CrossConST (Gao et al., 2023), have achieved impressive supervised and zero-shot performance in the neural machine translation (NMT) field.
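A minimal toy sketch of the R-Drop idea referenced here: the same input is passed through the model twice with different dropout masks, and a symmetric KL term penalizes disagreement between the two output distributions. This is a pure-Python illustration with a made-up `logits_fn`, not the papers' implementation.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def dropout(vec, rate, rng):
    # Inverted dropout: zero each unit with prob `rate`, rescale survivors.
    return [0.0 if rng.random() < rate else v / (1 - rate) for v in vec]

def rdrop_loss(logits_fn, x, target, alpha=1.0, rate=0.1, seed=0):
    """Cross-entropy over two stochastic forward passes plus a symmetric
    KL consistency term between their output distributions (R-Drop style)."""
    rng = random.Random(seed)
    p1 = softmax(logits_fn(dropout(x, rate, rng)))
    p2 = softmax(logits_fn(dropout(x, rate, rng)))
    ce = -(math.log(p1[target]) + math.log(p2[target])) / 2
    consistency = (kl(p1, p2) + kl(p2, p1)) / 2
    return ce + alpha * consistency
```

The consistency weight `alpha` and dropout rate are illustrative defaults, not values from the papers.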
1 code implementation • 12 Jun 2023 • Pengzhi Gao, Liwen Zhang, Zhongjun He, Hua Wu, Haifeng Wang
Multilingual sentence representations are the foundation for similarity-based bitext mining, which is crucial for scaling multilingual neural machine translation (NMT) systems to more languages.
1 code implementation • 12 May 2023 • Pengzhi Gao, Liwen Zhang, Zhongjun He, Hua Wu, Haifeng Wang
The experimental analysis also proves that CrossConST could close the sentence representation gap and better align the representation space.
1 code implementation • NAACL 2022 • Pengzhi Gao, Zhongjun He, Hua Wu, Haifeng Wang
We introduce Bi-SimCut: a simple but effective training strategy to boost neural machine translation (NMT) performance.
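The bidirectional pretraining component of such a strategy can be approximated by doubling the parallel corpus with reversed pairs, with a target-language tag prepended so one model learns both directions. A hedged sketch, not the authors' code; the tag tokens are assumptions for illustration.

```python
def make_bidirectional(pairs, src_tag="<en>", tgt_tag="<de>"):
    """Build a bidirectional training set from (src, tgt) pairs by adding
    the reversed direction, with a target-language tag prepended to the
    source side so a single model can translate both ways."""
    data = []
    for src, tgt in pairs:
        data.append((f"{tgt_tag} {src}", tgt))  # forward: src -> tgt
        data.append((f"{src_tag} {tgt}", src))  # backward: tgt -> src
    return data
```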
Ranked #1 on Machine Translation on WMT2014 German-English
no code implementations • Findings (EMNLP) 2021 • Jicheng Li, Pengzhi Gao, Xuanfu Wu, Yang Feng, Zhongjun He, Hua Wu, Haifeng Wang
To further improve the faithfulness and diversity of the translations, we propose two simple but effective approaches to select diverse sentence pairs in the training corpus and adjust the interpolation weight for each pair correspondingly.
no code implementations • NAACL (AutoSimTrans) 2021 • Ruiqing Zhang, Xiyang Wang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Zhi Li, Haifeng Wang, Ying Chen, Qinfei Li
This corpus is expected to promote the research of automatic simultaneous translation as well as the development of practical systems.
Automatic Speech Recognition (ASR) +2
no code implementations • 1 Jan 2021 • Chenze Shao, Meng Sun, Yang Feng, Zhongjun He, Hua Wu, Haifeng Wang
Under this framework, we introduce word-level ensemble learning and sequence-level ensemble learning for neural machine translation, where sequence-level ensemble learning is capable of aggregating translation models with different decoding strategies.
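The two combination levels described here can be sketched in a toy form: word-level ensembling averages per-step token distributions across models, while sequence-level ensembling rescores whole candidate translations with each model and keeps the best average. This is an illustrative simplification, not the paper's framework.

```python
def word_level_ensemble(step_dists):
    """Average per-token distributions from several models at one decoding
    step, then pick the argmax token (word-level combination)."""
    n = len(step_dists)
    vocab = len(step_dists[0])
    avg = [sum(d[i] for d in step_dists) / n for i in range(vocab)]
    return max(range(vocab), key=lambda i: avg[i]), avg

def sequence_level_ensemble(candidates, scorers):
    """Rescore whole candidate translations with every model's sequence
    log-probability and return the candidate with the best average score;
    this allows combining models that use different decoding strategies."""
    def avg_score(seq):
        return sum(s(seq) for s in scorers) / len(scorers)
    return max(candidates, key=avg_score)
```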
no code implementations • EMNLP 2020 • Liang Huang, Colin Cherry, Mingbo Ma, Naveen Arivazhagan, Zhongjun He
Simultaneous translation, which performs translation concurrently with the source speech, is widely useful in many scenarios such as international conferences, negotiations, press releases, legal proceedings, and medicine.
1 code implementation • 16 Dec 2019 • Yuchen Liu, Jiajun Zhang, Hao Xiong, Long Zhou, Zhongjun He, Hua Wu, Haifeng Wang, Cheng-qing Zong
Speech-to-text translation (ST), which translates source language speech into target language text, has attracted intensive attention in recent years.
Automatic Speech Recognition (ASR) +4
no code implementations • IJCNLP 2019 • Tianchi Bi, Hao Xiong, Zhongjun He, Hua Wu, Haifeng Wang
Conventional Neural Machine Translation (NMT) models benefit from training with an additional agent, e.g., dual learning, and bidirectional decoding with one agent decoding from left to right and the other decoding in the opposite direction.
no code implementations • WS 2019 • Meng Sun, Bojian Jiang, Hao Xiong, Zhongjun He, Hua Wu, Haifeng Wang
In this paper we introduce the systems Baidu submitted for the WMT19 shared task on Chinese↔English news translation.
no code implementations • 30 Jul 2019 • Hao Xiong, Ruiqing Zhang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Haifeng Wang
In this paper, we present DuTongChuan, a novel context-aware translation model for simultaneous interpreting.
Automatic Speech Recognition (ASR) +2
no code implementations • 17 Apr 2019 • Yuchen Liu, Hao Xiong, Zhongjun He, Jiajun Zhang, Hua Wu, Haifeng Wang, Cheng-qing Zong
End-to-end speech translation (ST), which directly translates from source language speech into target language text, has attracted intensive attention in recent years.
no code implementations • 14 Nov 2018 • Hao Xiong, Zhongjun He, Hua Wu, Haifeng Wang
Discourse coherence plays an important role in the translation of a text.
3 code implementations • ACL 2019 • Mingbo Ma, Liang Huang, Hao Xiong, Renjie Zheng, Kaibo Liu, Baigong Zheng, Chuanqiang Zhang, Zhongjun He, Hairong Liu, Xing Li, Hua Wu, Haifeng Wang
Simultaneous translation, which translates sentences before they are finished, is useful in many scenarios but is notoriously difficult due to word-order differences.
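Translating before the sentence is finished can be sketched with the wait-k read/write schedule: read k source tokens first, then alternate one write per read until the source is exhausted. The sketch below only produces the action sequence of such a policy, assuming known source and target lengths; the actual model predicts tokens prefix-to-prefix.

```python
def wait_k_schedule(src_len, tgt_len, k):
    """Produce the READ/WRITE action sequence of a wait-k policy: read k
    source tokens first, then alternate one WRITE per READ until the
    source is exhausted, then write the remaining target tokens."""
    actions, read, written = [], 0, 0
    while written < tgt_len:
        if read < min(k + written, src_len):
            actions.append("READ")
            read += 1
        else:
            actions.append("WRITE")
            written += 1
    return actions
```

For src_len = tgt_len = 5 and k = 2, the schedule reads two tokens before the first write, then interleaves reads and writes, keeping latency roughly k tokens behind the speaker.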
no code implementations • ACL 2019 • Hairong Liu, Mingbo Ma, Liang Huang, Hao Xiong, Zhongjun He
Neural machine translation (NMT) is notoriously sensitive to noise, but noise is almost inevitable in practice.
no code implementations • EMNLP 2018 • Yang Zhao, Jiajun Zhang, Zhongjun He, Cheng-qing Zong, Hua Wu
One of the weaknesses of Neural Machine Translation (NMT) is in handling low-frequency and ambiguous words, which we refer to as troublesome words.
no code implementations • 6 Dec 2017 • Hao Xiong, Zhongjun He, Xiaoguang Hu, Hua Wu
This encoder design yields a relatively uniform composition of the source sentence, despite the gating mechanism employed in the encoding RNN.
no code implementations • ACL 2016 • Yong Cheng, Wei Xu, Zhongjun He, Wei He, Hua Wu, Maosong Sun, Yang Liu
While end-to-end neural machine translation (NMT) has made remarkable progress recently, NMT systems only rely on parallel corpora for parameter estimation.
1 code implementation • 15 Dec 2015 • Yong Cheng, Shiqi Shen, Zhongjun He, Wei He, Hua Wu, Maosong Sun, Yang Liu
The attentional mechanism has proven to be effective in improving end-to-end neural machine translation.
1 code implementation • ACL 2016 • Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, Yang Liu
We propose minimum risk training for end-to-end neural machine translation.
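The minimum risk training objective can be sketched as the expected risk over a sampled candidate set, with the model's probabilities renormalized over those candidates and sharpened by a temperature-like factor. A toy sketch in the spirit of Shen et al. (2016); the sharpening value `alpha` is an illustrative assumption.

```python
import math

def mrt_loss(log_probs, risks, alpha=0.005):
    """Minimum risk training objective over a sampled candidate set:
    expected risk (e.g. 1 - BLEU) under the model's distribution,
    renormalized over the candidates and sharpened by `alpha`."""
    scaled = [alpha * lp for lp in log_probs]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    q = [e / z for e in exps]
    return sum(qi * ri for qi, ri in zip(q, risks))
```

Minimizing this pushes probability mass toward candidates with low risk, directly optimizing the evaluation metric rather than likelihood.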