1 code implementation • 27 Feb 2024 • Cameron Churchwell, Max Morrison, Bryan Pardo
A phonetic posteriorgram (PPG) is a time-varying categorical distribution over acoustic units of speech (e.g., phonemes).
1 code implementation • 12 Oct 2023 • Max Morrison, Pranav Pawar, Nathan Pruyne, Jennifer Cole, Bryan Pardo
Speech prominence estimation is the process of assigning a numeric value to the prominence of each word in an utterance.
1 code implementation • 28 Jan 2023 • Max Morrison, Caedon Hsieh, Nathan Pruyne, Bryan Pardo
Pitch is a foundational aspect of our perception of audio signals.
no code implementations • 26 Aug 2022 • Noah Schaffer, Boaz Cogan, Ethan Manilow, Max Morrison, Prem Seetharaman, Bryan Pardo
Despite phenomenal progress in recent years, state-of-the-art music separation systems produce source estimates with significant perceptual shortcomings, such as adding extraneous noise or removing harmonics.
1 code implementation • 8 Mar 2022 • Max Morrison, Brian Tang, Gefei Tan, Bryan Pardo
ReSEval lets researchers launch A/B, ABX, Mean Opinion Score (MOS) and MUltiple Stimuli with Hidden Reference and Anchor (MUSHRA) tests on audio, image, text, or video data from a command-line interface or using one line of Python, making it as easy to run as objective evaluation.
1 code implementation • ICLR 2022 • Max Morrison, Rithesh Kumar, Kundan Kumar, Prem Seetharaman, Aaron Courville, Yoshua Bengio
We show that simple pitch and periodicity conditioning is insufficient for reducing this error relative to using autoregression.
1 code implementation • 5 Oct 2021 • Max Morrison, Zeyu Jin, Nicholas J. Bryan, Juan-Pablo Caceres, Bryan Pardo
Modifying the pitch and timing of an audio signal are fundamental audio editing operations with applications in speech manipulation, audio-visual synchronization, and singing voice editing and synthesis.
no code implementations • 16 Feb 2021 • Max Morrison, Lucas Rencker, Zeyu Jin, Nicholas J. Bryan, Juan-Pablo Caceres, Bryan Pardo
Text-based speech editors expedite the process of editing speech recordings by permitting editing via intuitive cut, copy, and paste operations on a speech transcript.
no code implementations • 7 Aug 2020 • Max Morrison, Zeyu Jin, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore
Speech synthesis has recently seen significant improvements in fidelity, driven by the advent of neural vocoders and neural prosody generators.
no code implementations • 5 Nov 2019 • Max Morrison, Bryan Pardo
Many automobile components in need of repair produce characteristic sounds.