no code implementations • IEEE Access 2022 • Jennifer Santoso, Takeshi Yamada, Kenkichi Ishizuka, Taiichi Hashimoto, Shoji Makino
Although a method exists for improving ASR performance on emotional speech, it requires fine-tuning the ASR model, which is computationally expensive and discards cues that indicate the presence of emotion in speech segments and that could help SER.
Ranked #4 on Multimodal Emotion Recognition on IEMOCAP
Multimodal Emotion Recognition • Speech Emotion Recognition
no code implementations • 7 Jul 2022 • Jumon Nozaki, Tatsuya Kawahara, Kenkichi Ishizuka, Taiichi Hashimoto
We also propose to incorporate an auxiliary loss to train the model using the output of the intermediate layer and unpunctuated texts.
Automatic Speech Recognition (ASR)