Singing Voice Synthesis
19 papers with code • 0 benchmarks • 0 datasets
Most implemented papers
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Singing voice synthesis (SVS) systems aim to synthesize high-quality, expressive singing voices; within such a system, the acoustic model generates acoustic features (e.g., a mel-spectrogram) from a music score.
MLP Singer: Towards Rapid Parallel Singing Voice Synthesis
Recent developments in deep learning have significantly improved the quality of synthesized singing voice audio.
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus
High-fidelity multi-singer singing voice synthesis is challenging for neural vocoders because of the shortage of singing voice data, limited generalization across singers, and high computational cost.
NNSVS: A Neural Network-Based Singing Voice Synthesis Toolkit
This paper describes the design of NNSVS, an open-source toolkit for neural network-based singing voice synthesis research.
Singing Voice Synthesis Using Differentiable LPC and Glottal-Flow-Inspired Wavetables
This paper introduces GlOttal-flow LPC Filter (GOLF), a novel method for singing voice synthesis (SVS) that exploits the physical characteristics of the human voice using differentiable digital signal processing.
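As a rough illustration of the source-filter idea this entry refers to (not the GOLF implementation itself), the sketch below reads a single-period wavetable at a fixed pitch and passes the result through an all-pole LPC filter. The wavetable shape, the filter coefficients, and the function names are assumptions chosen for illustration, and nothing here is differentiable or learned.

```python
import math

def wavetable_excitation(table, f0, sample_rate, n_samples):
    """Read a single-period wavetable at pitch f0 with linear interpolation."""
    out = []
    phase = 0.0  # position within one period, in [0, 1)
    size = len(table)
    for _ in range(n_samples):
        pos = phase * size
        i = int(pos) % size
        frac = pos - int(pos)
        out.append((1.0 - frac) * table[i] + frac * table[(i + 1) % size])
        phase = (phase + f0 / sample_rate) % 1.0
    return out

def lpc_synthesis(excitation, coeffs):
    """All-pole filter: y[n] = x[n] - sum_k coeffs[k-1] * y[n-k]."""
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc -= a * y[n - k]
        y.append(acc)
    return y

# Hypothetical glottal-like pulse: a smooth bump for half the period, then silence.
TABLE = [math.sin(math.pi * i / 32) ** 2 if i < 32 else 0.0 for i in range(64)]
# Hypothetical stable second-order resonator near 700 Hz (pole radius 0.9).
COEFFS = [-2 * 0.9 * math.cos(2 * math.pi * 700 / 16000), 0.81]

source = wavetable_excitation(TABLE, f0=220.0, sample_rate=16000, n_samples=1600)
voiced = lpc_synthesis(source, COEFFS)
```

Here `source` plays the role of the glottal excitation and `voiced` the vocal-tract-filtered output; in a trained system both the wavetable and the filter coefficients would be predicted by a model rather than fixed by hand.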
Score and Lyrics-Free Singing Voice Generation
Generative models for singing voice have been mostly concerned with the task of "singing voice synthesis," i.e., to produce singing voice waveforms given musical scores and text lyrics.
HiFiSinger: Towards High-Fidelity Neural Singing Voice Synthesis
To tackle the difficulty of singing modeling caused by high sampling rate (wider frequency band and longer waveform), we introduce multi-scale adversarial training in both the acoustic model and vocoder to improve singing modeling.
Sequence-to-sequence Singing Voice Synthesis with Perceptual Entropy Loss
Neural network (NN) based singing voice synthesis (SVS) systems require sufficient data to train well and are prone to over-fitting when data is scarce.
Latent Space Explorations of Singing Voice Synthesis using DDSP
In this work we present a lightweight architecture, based on the Differentiable Digital Signal Processing (DDSP) library, that is able to output song-like utterances conditioned only on pitch and amplitude, after twelve hours of training using small datasets of unprocessed audio.
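To make "conditioned only on pitch and amplitude" concrete, here is a minimal sketch of an additive harmonic oscillator driven by per-sample pitch and amplitude curves. It does not use the DDSP library and is not the paper's architecture: the function name, the fixed 1/k harmonic rolloff, and the example curves are assumptions, whereas a DDSP-style model would have a learned decoder predict the harmonic weights.

```python
import math

def harmonic_synth(f0_hz, amplitude, sample_rate=16000, n_harmonics=8):
    """Additive synthesis from per-sample pitch (Hz) and amplitude curves."""
    if len(f0_hz) != len(amplitude):
        raise ValueError("f0_hz and amplitude must have the same length")
    phase = 0.0  # running phase of the fundamental, in radians
    audio = []
    for f0, amp in zip(f0_hz, amplitude):
        phase += 2.0 * math.pi * f0 / sample_rate
        sample = 0.0
        for k in range(1, n_harmonics + 1):
            if k * f0 < sample_rate / 2:  # skip harmonics above Nyquist
                sample += math.sin(k * phase) / k  # fixed 1/k rolloff (assumption)
        audio.append(amp * sample)
    return audio

# Hypothetical control curves: a one-second glide from 220 Hz to 440 Hz with a fade-out.
N = 16000
f0_curve = [220.0 + 220.0 * n / N for n in range(N)]
amp_curve = [1.0 - n / N for n in range(N)]
audio = harmonic_synth(f0_curve, amp_curve)
```

Accumulating phase sample by sample, rather than computing `sin(2πft)` directly, keeps the waveform continuous while the pitch changes.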
Sinsy: A Deep Neural Network-Based Singing Voice Synthesis System
To better model a singing voice, the proposed system incorporates improved approaches to modeling pitch and vibrato and better training criteria into the acoustic model.