no code implementations • 3 Nov 2023 • Xinmeng Xu, Yuhong Yang, Weiping Tu
To overcome this limitation, we introduce a strategy to map monaural speech into a fixed simulation space for better differentiation between target speech and noise.
no code implementations • 19 Sep 2023 • Hongyang Chen, Yuhong Yang, Qingmu Liu, Baifeng Li, Weiping Tu, Song Lin
We then compare natural and grid sentences in terms of the Lombard effect and Normal-to-Lombard conversion, using LCT and the Enhanced MAndarin Lombard Grid (EMALG) corpus.
no code implementations • 28 Jul 2023 • Xinmeng Xu, Weiping Tu, Yuhong Yang
Convolutional neural networks (CNNs) and Transformers have been widely successful in multimedia applications.
no code implementations • 26 Jul 2023 • Chang Han, Xinmeng Xu, Weiping Tu, Yuhong Yang, Yajie Liu
We observe that, besides target positive information, e.g., ground-truth speech and features, target negative information, such as interference signals and features, helps make the patterns of target speech and interference signals more discriminative.
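To illustrate how negative information might be exploited, here is a minimal triplet-style sketch in PyTorch that pulls enhanced features toward ground-truth speech and pushes them away from interference features. The feature shapes, margin, and function name are illustrative assumptions, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

# Hypothetical discriminative objective: treat the enhanced representation as
# an anchor, ground-truth speech features as the positive, and interference
# features as the negative. Margin and shapes are assumptions.
def discriminative_loss(enhanced, clean, interference, margin=1.0):
    """Pull enhanced features toward clean speech, push away from interference."""
    return F.triplet_margin_loss(enhanced, clean, interference, margin=margin)

# Toy usage with random (batch, feature) tensors.
enh = torch.randn(8, 256)
pos = torch.randn(8, 256)   # ground-truth speech features (positive)
neg = torch.randn(8, 256)   # interference features (negative)
print(discriminative_loss(enh, pos, neg).item())
```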
no code implementations • 26 Apr 2023 • Xinmeng Xu, Weiping Tu, Chang Han, Yuhong Yang
In this study, we propose an SE model that integrates both positive and negative speech information to improve SE performance by adopting contrastive learning, which comprises two innovations.
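As a minimal sketch of what such a contrastive SE objective could look like, the following uses an InfoNCE-style loss over enhanced (anchor), clean (positive), and interference (negative) embeddings. The embedding dimensions and temperature are assumptions; the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

# InfoNCE-style contrastive loss over speech embeddings (illustrative only).
def contrastive_se_loss(anchor, positive, negatives, temperature=0.1):
    # anchor, positive: (batch, dim); negatives: (batch, n_neg, dim)
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos_logit = (anchor * positive).sum(-1, keepdim=True)       # (batch, 1)
    neg_logits = torch.einsum("bd,bnd->bn", anchor, negatives)  # (batch, n_neg)
    logits = torch.cat([pos_logit, neg_logits], dim=1) / temperature
    labels = torch.zeros(anchor.size(0), dtype=torch.long)      # positive is index 0
    return F.cross_entropy(logits, labels)
```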
no code implementations • 7 Dec 2022 • Xinmeng Xu, Weiping Tu, Yuhong Yang
Attention mechanisms, such as local and non-local attention, play a fundamental role in recent deep-learning-based speech enhancement (SE) systems.
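For context, a non-local attention block of the kind such SE systems use can be sketched as scaled dot-product self-attention over the time frames of a spectrogram. All dimensions below are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Minimal non-local (dot-product self-attention) block over the time axis.
class NonLocalAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x):            # x: (batch, frames, freq_bins)
        q, k, v = self.query(x), self.key(x), self.value(x)
        attn = torch.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return x + attn @ v          # residual connection

spec = torch.randn(2, 100, 257)      # e.g., 100 frames of a 257-bin spectrogram
out = NonLocalAttention(257)(spec)
```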
no code implementations • 2 Dec 2022 • Xinmeng Xu, Weiping Tu, Yuhong Yang
To address this issue, we inject spatial information into the monaural SE model and propose a knowledge distillation strategy that lets the monaural SE model learn binaural speech features from a binaural SE model, enabling it to reconstruct enhanced speech with higher intelligibility and quality under low signal-to-noise ratio (SNR) conditions.
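A minimal sketch of the described distillation idea, assuming the monaural student is trained to match intermediate features of the binaural teacher alongside an ordinary enhancement loss; the feature shapes, loss choices, and weighting factor are assumptions.

```python
import torch
import torch.nn.functional as F

# Combined loss: enhancement term plus a feature-matching distillation term.
def distillation_loss(student_feat, teacher_feat, enhanced, clean, alpha=0.5):
    kd = F.mse_loss(student_feat, teacher_feat.detach())  # mimic binaural features
    se = F.l1_loss(enhanced, clean)                       # standard enhancement loss
    return se + alpha * kd
```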
1 code implementation • 27 Oct 2022 • Jingyi Li, Weiping Tu, Li Xiao
Voice conversion (VC) can be achieved by first extracting source content information and target speaker information, and then reconstructing the waveform from this information.
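The described pipeline can be sketched as a content encoder over the source utterance, a speaker encoder over the target utterance, and a decoder that combines the two. All module types and sizes below are illustrative assumptions, not the paper's released implementation.

```python
import torch
import torch.nn as nn

# Minimal content/speaker disentanglement sketch for VC.
class VoiceConverter(nn.Module):
    def __init__(self, n_mels=80, content_dim=128, speaker_dim=64):
        super().__init__()
        self.content_enc = nn.GRU(n_mels, content_dim, batch_first=True)
        self.speaker_enc = nn.GRU(n_mels, speaker_dim, batch_first=True)
        self.decoder = nn.GRU(content_dim + speaker_dim, n_mels, batch_first=True)

    def forward(self, source_mel, target_mel):
        content, _ = self.content_enc(source_mel)   # frame-level content
        _, spk = self.speaker_enc(target_mel)       # utterance-level speaker
        spk = spk[-1].unsqueeze(1).expand(-1, content.size(1), -1)
        converted, _ = self.decoder(torch.cat([content, spk], dim=-1))
        return converted                            # mel features to be vocoded

src = torch.randn(1, 200, 80)   # source utterance mel-spectrogram
tgt = torch.randn(1, 150, 80)   # target speaker reference mel
mel_out = VoiceConverter()(src, tgt)
```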