no code implementations • 23 Feb 2024 • Xavier Riley, Drew Edwards, Simon Dixon
Focusing on the guitar, we refine this approach to training on score data using a dataset of commercially available score-audio pairs.
no code implementations • 9 Feb 2024 • Yixiao Zhang, Yukara Ikemiya, Gus Xia, Naoki Murata, Marco A. Martínez-Ramírez, Wei-Hsiang Liao, Yuki Mitsufuji, Simon Dixon
This paper introduces a novel approach to the editing of music generated by such models, enabling the modification of specific attributes, such as genre, mood and instrument, while maintaining other aspects unchanged.
no code implementations • 2 Feb 2024 • Drew Edwards, Simon Dixon, Emmanouil Benetos, Akira Maezawa, Yuta Kusaka
Algorithms for automatic piano transcription have improved dramatically in recent years due to new datasets and modeling techniques.
1 code implementation • 19 Oct 2023 • Yixiao Zhang, Akira Maezawa, Gus Xia, Kazuhiko Yamamoto, Simon Dixon
Creating music is iterative, requiring varied methods at each stage.
1 code implementation • 5 Sep 2023 • Huan Zhang, Emmanouil Karystinaios, Simon Dixon, Gerhard Widmer, Carlos Eduardo Cancino-Chacón
Music Information Retrieval (MIR) has seen a recent surge in deep learning-based approaches, which often involve encoding symbolic music (i.e., music represented in terms of discrete note events) in an image-like or language-like fashion.
no code implementations • 27 Feb 2023 • Brendan O'Connor, Simon Dixon
We propose an alternative component for a loss function that is otherwise well established in voice conversion (VC) tasks, and show that it improves our model's singing voice conversion (SVC) performance.
1 code implementation • 24 Aug 2022 • Yixiao Zhang, Junyan Jiang, Gus Xia, Simon Dixon
Lyric interpretations can help people understand songs and their lyrics quickly, and can also make it easier to manage, retrieve, and discover songs from ever-growing music archives.
1 code implementation • 12 May 2022 • Yin-Jyun Luo, Sebastian Ewert, Simon Dixon
In this paper, we show that the vanilla DSAE suffers from being sensitive to the choice of model architecture and capacity of the dynamic latent variables, and is prone to collapse the static latent variable.
no code implementations • 19 Apr 2022 • Ruchit Agrawal, Daniel Wolff, Simon Dixon
Our method is also robust to structural differences between the performance and score sequences, which is a common limitation of standard alignment approaches.
2 code implementations • 5 Aug 2021 • Emir Demirel, Sven Ahlbäck, Simon Dixon
This paper makes several contributions to automatic lyrics transcription (ALT) research.
no code implementations • 28 Jul 2021 • Carlos Lordelo, Emmanouil Benetos, Simon Dixon, Sven Ahlbäck
We also include ablation studies investigating the effects of the use of multiple kernel shapes and comparing different input representations for the audio and the note-related information.
1 code implementation • 21 Jun 2021 • Emir Demirel, Sven Ahlbäck, Simon Dixon
Recent automatic lyrics transcription (ALT) approaches focus on building stronger acoustic models or in-domain language models, while the pronunciation aspect is seldom touched upon.
no code implementations • 31 Jan 2021 • Ruchit Agrawal, Daniel Wolff, Simon Dixon
The identification of structural differences between a music performance and the score is a challenging yet integral step of audio-to-score alignment, an important subtask of music information retrieval.
no code implementations • 3 Jan 2021 • Carlos Lordelo, Emmanouil Benetos, Simon Dixon, Sven Ahlbäck, Patrik Ohlsson
This paper addresses the problem of domain adaptation for the task of music source separation.
no code implementations • 15 Nov 2020 • Ruchit Agrawal, Simon Dixon
Audio-to-score alignment aims at generating an accurate mapping between a performance audio and the score of a given piece.
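Such a mapping is commonly computed as a baseline with dynamic time warping (DTW). The minimal sketch below aligns toy 1-D feature sequences with absolute difference as the local cost; real alignment systems use chroma or spectral features and more sophisticated models than this.

```python
import numpy as np

def dtw_path(perf, score):
    """Align two 1-D feature sequences with dynamic time warping.

    Returns the optimal warping path as (performance_index, score_index)
    pairs. A toy illustration: real systems align multi-dimensional
    audio features, not scalars.
    """
    n, m = len(perf), len(score)
    cost = np.abs(np.subtract.outer(perf, score))      # local cost matrix
    acc = np.full((n + 1, m + 1), np.inf)              # accumulated cost
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    # Backtrack from the end to recover the optimal path.
    i, j, path = n, m, []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

# A performance that stretches two notes of the score in time.
path = dtw_path([1, 1, 2, 3, 3], [1, 2, 3])
print(path)  # monotone path from (0, 0) to (4, 2)
```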
no code implementations • 28 Jul 2020 • Ruchit Agrawal, Simon Dixon
Audio-to-score alignment aims at generating an accurate mapping between a performance audio and the score of a given piece.
2 code implementations • 13 Jul 2020 • Emir Demirel, Sven Ahlbäck, Simon Dixon
Speech recognition is a well-developed research field, and current state-of-the-art systems are used in many software applications; yet, to date, no comparably robust system exists for the recognition of words and sentences from the singing voice.
1 code implementation • 15 May 2020 • Saumitra Mishra, Emmanouil Benetos, Bob L. Sturm, Simon Dixon
One way to analyse the behaviour of machine learning models is through local explanations that highlight input features that maximally influence model predictions.
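The idea of a local explanation can be sketched with a simple occlusion test: perturb each input feature in turn and measure how much the model's output changes. The toy linear model below is purely hypothetical; the paper itself studies more principled explanation methods.

```python
import numpy as np

def occlusion_importance(model, x, baseline=0.0):
    """Score each feature by how much replacing it with a baseline
    value changes the model's prediction (a crude local explanation)."""
    base_pred = model(x)
    scores = np.zeros(len(x))
    for i in range(len(x)):
        x_pert = x.copy()
        x_pert[i] = baseline
        scores[i] = abs(model(x_pert) - base_pred)
    return scores

# Hypothetical linear "model": feature 2 has the largest weight,
# so it should dominate the explanation.
weights = np.array([0.1, 0.2, 5.0, 0.3])
model = lambda x: float(x @ weights)

x = np.ones(4)
scores = occlusion_importance(model, x)
print(scores.argmax())  # -> 2, the maximally influential feature
```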
1 code implementation • 14 Nov 2019 • Daniel Stoller, Mi Tian, Sebastian Ewert, Simon Dixon
In comparison to TCN and Wavenet, our network consistently saves memory and computation time, with speed-ups for training and inference of over 4x in the audio generation experiment in particular, while achieving a comparable performance in all tasks.
Ranked #2 on Music Modeling on Nottingham
1 code implementation • ICLR 2020 • Daniel Stoller, Sebastian Ewert, Simon Dixon
We apply our method to image generation, image segmentation and audio source separation, and obtain improved performance over a standard GAN when additional incomplete training examples are available.
no code implementations • 21 Apr 2019 • Saumitra Mishra, Daniel Stoller, Emmanouil Benetos, Bob L. Sturm, Simon Dixon
However, this requires a careful selection of hyper-parameters to generate interpretable examples for each neuron of interest, and current methods rely on a manual, qualitative evaluation of each setting, which is prohibitively slow.
9 code implementations • 8 Jun 2018 • Daniel Stoller, Sebastian Ewert, Simon Dixon
Models for audio source separation usually operate on the magnitude spectrum, which ignores phase information and makes separation performance dependent on hyper-parameters for the spectral front-end.
Ranked #27 on Music Source Separation on MUSDB18
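The magnitude-spectrum limitation described in the entry above can be made concrete: masking-based separators estimate a source magnitude, apply it as a ratio mask, and borrow the mixture phase for reconstruction, which is exactly the approximation that waveform-domain models such as Wave-U-Net sidestep. A minimal sketch on a toy complex "STFT":

```python
import numpy as np

def mask_and_reconstruct(mix_stft, est_mag):
    """Spectral-masking separation: apply the estimated source magnitude
    as a soft ratio mask and reuse the mixture phase. Borrowing the
    phase is the approximation waveform models avoid."""
    mix_mag = np.abs(mix_stft)
    mask = np.clip(est_mag / np.maximum(mix_mag, 1e-8), 0.0, 1.0)
    phase = np.exp(1j * np.angle(mix_stft))   # mixture phase, unchanged
    return mask * mix_mag * phase

# Toy 2x2 "STFT"; the estimate keeps half the mixture energy per bin.
mix = np.array([[1 + 1j, 2.0], [0.5j, 1.0]])
est = 0.5 * np.abs(mix)
src = mask_and_reconstruct(mix, est)
print(np.allclose(np.abs(src), 0.5 * np.abs(mix)))  # True
```

Note that however accurate the magnitude estimate, the reconstructed source keeps the mixture's phase in every bin, which bounds the attainable separation quality.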
no code implementations • 5 Apr 2018 • Daniel Stoller, Sebastian Ewert, Simon Dixon
A main challenge in applying deep learning to music processing is the availability of training data.
3 code implementations • 31 Oct 2017 • Daniel Stoller, Sebastian Ewert, Simon Dixon
Based on this idea, we drive the separator towards outputs deemed as realistic by discriminator networks that are trained to tell apart real from separator samples.
no code implementations • 23 Mar 2017 • Eita Nakamura, Kazuyoshi Yoshii, Simon Dixon
This paper presents a statistical method for use in music transcription that can estimate score times of note onsets and offsets from polyphonic MIDI performance signals.
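The input/output relation of such rhythm quantization can be illustrated with a naive grid snapper. The paper's statistical model goes far beyond this (it jointly estimates tempo and note values); the sketch below assumes a known, fixed tempo.

```python
def quantize_onsets(onsets_sec, tempo_bpm=120, grid=0.25):
    """Naively snap performed onset times (seconds) to the nearest
    score-time grid position, in quarter-note beats. `grid=0.25`
    means a sixteenth-note grid. A toy stand-in for the paper's
    statistical estimation of score times."""
    beats_per_sec = tempo_bpm / 60.0
    return [round(t * beats_per_sec / grid) * grid for t in onsets_sec]

# Performed onsets with slight timing noise at 120 BPM (0.5 s per beat).
print(quantize_onsets([0.02, 0.51, 1.24, 1.49]))  # [0.0, 1.0, 2.5, 3.0]
```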
1 code implementation • 7 Aug 2015 • Siddharth Sigtia, Emmanouil Benetos, Simon Dixon
We compare performance of the neural network based acoustic models with two popular unsupervised acoustic models.
no code implementations • 6 Nov 2014 • Siddharth Sigtia, Emmanouil Benetos, Nicolas Boulanger-Lewandowski, Tillman Weyde, Artur S. d'Avila Garcez, Simon Dixon
We investigate the problem of incorporating higher-level symbolic score-like information into Automatic Music Transcription (AMT) systems to improve their performance.
no code implementations • 9 Jul 2014 • Peter Foster, Simon Dixon, Anssi Klapuri
This paper investigates methods for quantifying similarity between audio signals, specifically for the task of cover song detection.
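One basic requirement for cover song detection is transposition invariance, since covers are often performed in a different key. A toy illustration using averaged 12-bin chroma vectors (the paper's methods operate on full feature sequences, not averages):

```python
import numpy as np

def cover_similarity(chroma_a, chroma_b):
    """Transposition-invariant similarity between two averaged 12-bin
    chroma vectors: the best cosine similarity over all 12 key shifts.
    A toy stand-in for sequence-level similarity measures."""
    a = chroma_a / np.linalg.norm(chroma_a)
    best = -1.0
    for shift in range(12):
        b = np.roll(chroma_b, shift)
        best = max(best, float(a @ (b / np.linalg.norm(b))))
    return best

song = np.array([1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0], float)  # C major triad
cover = np.roll(song, 2)  # the same triad transposed up a whole tone
print(cover_similarity(song, cover))  # ~1.0: identical up to transposition
```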
no code implementations • 27 Feb 2014 • Peter Foster, Matthias Mauch, Simon Dixon
To verify that our descriptors capture musically relevant information, we incorporate our descriptors into similarity rating prediction and song year prediction tasks.