1 code implementation • 10 Apr 2024 • Hanyu Meng, Vidhyasaharan Sethu, Eliathamby Ambikairajah
There is increasing interest in the use of the LEArnable Front-end (LEAF) in a variety of speech processing systems.
no code implementations • 17 Oct 2023 • Antoni Dimitriadis, Siqi Pan, Vidhyasaharan Sethu, Beena Ahmed
Spatial HuBERT learns representations that outperform state-of-the-art single-channel speech representations on a variety of spatial downstream tasks, particularly in reverberant and noisy environments.
no code implementations • 21 Sep 2023 • Zheng Nan, Ting Dang, Vidhyasaharan Sethu, Beena Ahmed
Connectionist temporal classification (CTC) is commonly adopted for sequence modeling tasks like speech recognition, where it is necessary to preserve order between the input and target sequences.
no code implementations • 10 Aug 2021 • Jingyao Wu, Ting Dang, Vidhyasaharan Sethu, Eliathamby Ambikairajah
We propose a Markovian framework, referred to as the Dynamic Ordinal Markov Model (DOMM), that makes use of both absolute and relative ordinal information to improve speech-based ordinal emotion prediction.
no code implementations • 1 Sep 2019 • Vidhyasaharan Sethu, Emily Mower Provost, Julien Epps, Carlos Busso, Nicholas Cummins, Shrikanth Narayanan
A key reason for this is the lack of a common mathematical framework to describe all the relevant elements of emotion representations.