Search Results for author: Zakaria Aldeneh

Found 12 papers, 3 papers with code

ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models

2 code implementations • 30 Jan 2024 • Jee-weon Jung, Wangyou Zhang, Jiatong Shi, Zakaria Aldeneh, Takuya Higuchi, Barry-John Theobald, Ahmed Hussen Abdelaziz, Shinji Watanabe

First, we provide an open-source platform for researchers in the speaker recognition community to effortlessly build models.

Ranked #1 on Speaker Verification on VoxCeleb (using extra training data)

Self-Supervised Learning Speaker Recognition +1

Spatial LibriSpeech: An Augmented Dataset for Spatial Audio Learning

1 code implementation • 18 Aug 2023 • Miguel Sarabia, Elena Menyaylenko, Alessandro Toso, Skyler Seto, Zakaria Aldeneh, Shadi Pirhosseinloo, Luca Zappella, Barry-John Theobald, Nicholas Apostoloff, Jonathan Sheaffer

We present Spatial LibriSpeech, a spatial audio dataset with over 650 hours of 19-channel audio, first-order ambisonics, and optional distractor noise.

8k Position

Naturalistic Head Motion Generation from Speech

no code implementations • 26 Oct 2022 • Trisha Mittal, Zakaria Aldeneh, Masha Fedzechkina, Anurag Ranjan, Barry-John Theobald

Synthesizing natural head motion to accompany speech for an embodied conversational agent is necessary for providing a rich interactive experience.

On the role of Lip Articulation in Visual Speech Perception

no code implementations • 18 Mar 2022 • Zakaria Aldeneh, Masha Fedzechkina, Skyler Seto, Katherine Metcalf, Miguel Sarabia, Nicholas Apostoloff, Barry-John Theobald

Previous research has shown that traditional metrics used to optimize and assess models for generating lip motion from speech are not a good indicator of subjective opinion of animation quality.

Learning Paralinguistic Features from Audiobooks through Style Voice Conversion

no code implementations • NAACL 2021 • Zakaria Aldeneh, Matthew Perez, Emily Mower Provost

Paralinguistics, the non-lexical components of speech, play a crucial role in human-human interaction.

Emotion Recognition Voice Conversion

On the Role of Visual Cues in Audiovisual Speech Enhancement

no code implementations • 25 Apr 2020 • Zakaria Aldeneh, Anushree Prasanna Kumar, Barry-John Theobald, Erik Marchi, Sachin Kajarekar, Devang Naik, Ahmed Hussen Abdelaziz

One byproduct of this finding is that the learned visual embeddings can be used as features for other visual speech applications.

Self-Supervised Learning Speech Enhancement

Identifying Mood Episodes Using Dialogue Features from Clinical Interviews

no code implementations • 29 Sep 2019 • Zakaria Aldeneh, Mimansa Jaiswal, Michael Picheny, Melvin McInnis, Emily Mower Provost

Bipolar disorder, a severe chronic mental illness characterized by pathological mood swings from depression to mania, requires ongoing symptom severity tracking to both guide and measure treatments that are critical for maintaining long-term health.

Controlling for Confounders in Multimodal Emotion Classification via Adversarial Learning

no code implementations • 23 Aug 2019 • Mimansa Jaiswal, Zakaria Aldeneh, Emily Mower Provost

Our results show that stress is indeed encoded in trained emotion classifiers and that this encoding varies across levels of emotions and across the lexical and acoustic modalities.

Classification Emotion Classification +2

MuSE-ing on the Impact of Utterance Ordering On Crowdsourced Emotion Annotations

no code implementations • 27 Mar 2019 • Mimansa Jaiswal, Zakaria Aldeneh, Cristian-Paul Bara, Yuanhang Luo, Mihai Burzo, Rada Mihalcea, Emily Mower Provost

As a result, annotations are colored by the manner in which they were collected.

Emotion Recognition

Improving End-of-turn Detection in Spoken Dialogues by Detecting Speaker Intentions as a Secondary Task

no code implementations • 9 May 2018 • Zakaria Aldeneh, Dimitrios Dimitriadis, Emily Mower Provost

This work focuses on the use of acoustic cues for modeling turn-taking in dyadic spoken dialogues.

Capturing Long-term Temporal Dependencies with Convolutional Networks for Continuous Emotion Recognition

no code implementations • 23 Aug 2017 • Soheil Khorram, Zakaria Aldeneh, Dimitrios Dimitriadis, Melvin McInnis, Emily Mower Provost

The goal of continuous emotion recognition is to assign an emotion value to every frame in a sequence of acoustic features.

Emotion Recognition

Progressive Neural Networks for Transfer Learning in Emotion Recognition

1 code implementation • 10 Jun 2017 • John Gideon, Soheil Khorram, Zakaria Aldeneh, Dimitrios Dimitriadis, Emily Mower Provost

Many paralinguistic tasks are closely related and thus representations learned in one domain can be leveraged for another.

Emotion Recognition Transfer Learning
