no code implementations • 1 Mar 2024 • Weiwei Lin, Chenhang He, Man-Wai Mak, Jiachen Lian, Kong Aik Lee
This forces the model to learn a speaker distribution disentangled from the semantic content.
no code implementations • 27 Nov 2023 • Zezhong Jin, Youzhi Tu, Man-Wai Mak
The intuition is that phonetic information can preserve low-level acoustic dynamics with speaker information and thus partly compensate for the degradation due to noise and reverberation.
no code implementations • 23 Sep 2023 • Youzhi Tu, Man-Wai Mak, Jen-Tzung Chien
Contrastive speaker embedding assumes that the contrast between the positive and negative pairs of speech segments is attributed to speaker identity only.
no code implementations • 8 Sep 2023 • Chong-Xin Gan, Man-Wai Mak, Weiwei Lin, Jen-Tzung Chien
Contrastive self-supervised learning (CSL) for speaker verification (SV) has drawn increasing interest recently due to its ability to exploit unlabeled data.
1 code implementation • 18 Aug 2023 • Chongkai Lu, Man-Wai Mak, Ruimin Li, Zheru Chi, Hong Fu
The framework locates actions in videos by detecting the action evolution process.
no code implementations • 14 May 2023 • Weiwei Lin, Chenhang He, Man-Wai Mak, Youzhi Tu
Self-supervised learning (SSL) speech models such as wav2vec and HuBERT have demonstrated state-of-the-art performance on automatic speech recognition (ASR) and proved to be extremely useful in low label-resource settings.
no code implementations • 28 Mar 2023 • Haiquan Mao, Feng Hong, Man-Wai Mak
Inspired by the self-training strategies that use an existing classifier to label the unlabeled data for retraining, we propose a cluster-guided UDA framework that labels the target domain data by clustering and combines the labeled source domain data and pseudo-labeled target domain data to train a speaker embedding network.
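The pseudo-labeling step described in this abstract can be illustrated with a minimal sketch. It assumes embeddings are already extracted as NumPy arrays and uses a plain k-means as the clustering method; the abstract does not say which clustering algorithm the authors use, and the function names `kmeans_labels` and `build_training_set` are hypothetical. Training the speaker embedding network on the combined set is not shown.

```python
import numpy as np


def kmeans_labels(x, k, iters=20, seed=0):
    """Assign each row of x to one of k clusters with plain k-means.

    Assumption: k-means stands in for whatever clustering the paper uses.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points (fancy indexing copies).
    centers = x[rng.choice(len(x), size=k, replace=False)]
    labels = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        # Squared Euclidean distance from every point to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return labels


def build_training_set(src_x, src_y, tgt_x, k):
    """Combine labeled source data with pseudo-labeled target data."""
    src_x = np.asarray(src_x, dtype=float)
    src_y = np.asarray(src_y, dtype=int)
    tgt_x = np.asarray(tgt_x, dtype=float)
    # Offset the cluster ids so pseudo-labels never collide with source labels.
    pseudo_y = kmeans_labels(tgt_x, k) + src_y.max() + 1
    return np.concatenate([src_x, tgt_x]), np.concatenate([src_y, pseudo_y])


if __name__ == "__main__":
    src_x = np.array([[1.0, 1.0], [5.0, 5.0]])
    src_y = np.array([0, 1])
    tgt_x = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.1, 10.0]])
    all_x, all_y = build_training_set(src_x, src_y, tgt_x, k=2)
    print(all_x.shape, all_y)
```

Offsetting the cluster ids treats every target cluster as a new speaker class, which is one simple way to merge the two label spaces; whether the paper merges them this way is not stated in the abstract.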
no code implementations • 29 Oct 2022 • Zhe Li, Man-Wai Mak, Helen Mei-Ling Meng
The challenges in applying contrastive learning to speaker verification (SV) are that the softmax-based contrastive loss lacks discriminative power and that the hard negative pairs can easily influence learning.
1 code implementation • 29 Oct 2022 • Zhe Li, Man-Wai Mak
A great challenge in speaker representation learning using deep models is to design learning objectives that can enhance the discrimination of unseen speakers under unseen domains.
no code implementations • 8 Aug 2017 • Shibiao Wan, Man-Wai Mak, Sun-Yuan Kung
In the post-genomic era, large-scale personal DNA sequences are produced and collected for genetic medical diagnoses and new drug discovery, which, however, simultaneously poses serious challenges to the protection of personal genomic privacy.