1 code implementation • 1 Jan 2024 • Ruizhuo Xu, Linzhi Huang, Mei Wang, Jiani Hu, Weihong Deng
In this paper, we show that using high-level contextualized features as prediction targets can achieve superior performance.
no code implementations • 1 Jan 2024 • Ruizhuo Xu, Ke Wang, Chao Deng, Mei Wang, Xi Chen, Wenhui Huang, Junlan Feng, Weihong Deng
With the increasing availability of consumer depth sensors, 3D face recognition (FR) has attracted growing attention.
no code implementations • 22 Apr 2022 • Lin Yao, Jianfei Song, Ruizhuo Xu, Yingfang Yang, Zijian Chen, Yafeng Deng
There are two main methods for SLU tasks: (1) the two-stage method, which uses a speech model to convert speech to text and then a language model to produce the downstream task results; and (2) the one-stage method, which fine-tunes a pre-trained speech model directly on the downstream tasks.
Automatic Speech Recognition (ASR)
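The two SLU methods described in the entry above differ mainly in where text appears in the pipeline. The following is a minimal sketch of that contrast, not the authors' implementation: the function names (`two_stage_slu`, `one_stage_slu`) and the toy models are hypothetical stand-ins, and the "audio" is simply UTF-8 bytes so the example runs without any speech library.

```python
from typing import Callable

# Hypothetical type aliases; none of these names come from the paper.
SpeechToText = Callable[[bytes], str]      # speech model: audio -> transcript
TextToLabel = Callable[[str], str]         # language model: transcript -> task result
SpeechToLabel = Callable[[bytes], str]     # fine-tuned speech model: audio -> task result


def two_stage_slu(audio: bytes, asr: SpeechToText, nlu: TextToLabel) -> str:
    """Two-stage method: transcribe speech first, then run a language model on the text."""
    transcript = asr(audio)
    return nlu(transcript)


def one_stage_slu(audio: bytes, speech_model: SpeechToLabel) -> str:
    """One-stage method: a single speech model maps audio directly to the task result."""
    return speech_model(audio)


# Toy stand-ins (assumptions for illustration only).
def toy_asr(audio: bytes) -> str:
    # Pretend the audio bytes already encode the spoken words.
    return audio.decode("utf-8")


def toy_nlu(text: str) -> str:
    # Trivial keyword-based intent classifier.
    return "play_music" if "play" in text.lower() else "unknown"


def toy_speech_model(audio: bytes) -> str:
    # Stands in for a pre-trained speech model fine-tuned on the downstream task.
    return "play_music" if b"play" in audio.lower() else "unknown"


if __name__ == "__main__":
    clip = b"Play some jazz"
    print(two_stage_slu(clip, toy_asr, toy_nlu))
    print(one_stage_slu(clip, toy_speech_model))
```

In the two-stage form, an error in the transcript propagates to the language model, while the one-stage form never exposes an intermediate transcript at all.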