Keyword Spotting on Google Speech Commands

2 papers with code • 0 benchmarks • 0 datasets

Most implemented papers

End-to-End Audio Strikes Back: Boosting Augmentations Towards An Efficient Audio Classification Network

Alibaba-MIIL/AudioClassfication 25 Apr 2022

While efficient architectures and a plethora of augmentations for end-to-end image classification tasks have been suggested and heavily investigated, state-of-the-art techniques for audio classification still rely on numerous representations of the audio signal together with large architectures, fine-tuned from large datasets.

Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input

nttcslab/m2d 26 Oct 2022

We propose a new method, Masked Modeling Duo (M2D), that learns representations directly while obtaining training signals using only masked patches.