no code implementations • 29 Apr 2024 • Nikita Drobyshev, Antoni Bigata Casademunt, Konstantinos Vougioukas, Zoe Landgraf, Stavros Petridis, Maja Pantic
Head avatars animated by visual signals have gained popularity, particularly in cross-driving synthesis where the driver differs from the animated character, a challenging but highly practical approach.
1 code implementation • 15 May 2023 • Antoni Bigata Casademunt, Rodrigo Mira, Nikita Drobyshev, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
Speech-driven animation has gained significant traction in recent years, with current methods achieving near-photorealistic results.
no code implementations • CVPR 2023 • Xubo Liu, Egor Lakomkin, Konstantinos Vougioukas, Pingchuan Ma, Honglie Chen, Ruiming Xie, Morrie Doulaty, Niko Moritz, Jáchym Kolář, Stavros Petridis, Maja Pantic, Christian Fuegen
Furthermore, when combined with large-scale pseudo-labeled audio-visual data SynthVSR yields a new state-of-the-art VSR WER of 16.9% using publicly available data only, surpassing the recent state-of-the-art approaches trained with 29 times more non-public machine-transcribed video data (90,000 hours).
no code implementations • 6 Jan 2023 • Michał Stypułkowski, Konstantinos Vougioukas, Sen He, Maciej Zięba, Stavros Petridis, Maja Pantic
Talking face generation has historically struggled to produce head movements and natural facial expressions without guidance from additional reference videos.
no code implementations • 27 Apr 2021 • Rodrigo Mira, Konstantinos Vougioukas, Pingchuan Ma, Stavros Petridis, Björn W. Schuller, Maja Pantic
In this work, we propose a new end-to-end video-to-speech model based on Generative Adversarial Networks (GANs) which translates spoken video to waveform without using any intermediate representation or separate waveform synthesis algorithm.
1 code implementation • ICLR 2021 • Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
Domain translation is the process of transforming data from one domain to another while preserving the common semantics.
1 code implementation • CVPR 2021 • Alexandros Haliassos, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
Extensive experiments show that this simple approach significantly surpasses the state of the art in terms of generalisation to unseen manipulations and robustness to perturbations, and shed light on the factors responsible for its performance.
Ranked #5 on DeepFake Detection on FakeAVCeleb
no code implementations • 13 Jan 2020 • Abhinav Shukla, Konstantinos Vougioukas, Pingchuan Ma, Stavros Petridis, Maja Pantic
Self-supervised representation learning has recently attracted a lot of research interest for both the audio and visual modalities.
Ranked #8 on Speech Emotion Recognition on CREMA-D
no code implementations • 12 Dec 2019 • Triantafyllos Kefalas, Konstantinos Vougioukas, Yannis Panagakis, Stavros Petridis, Jean Kossaifi, Maja Pantic
Speech-driven facial animation involves using a speech signal to generate realistic videos of talking faces.
no code implementations • 14 Jun 2019 • Konstantinos Vougioukas, Pingchuan Ma, Stavros Petridis, Maja Pantic
Speech is a means of communication which relies on both audio and visual information.
no code implementations • 14 Jun 2019 • Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
We present an end-to-end system that generates videos of a talking head, using only a still image of a person and an audio clip containing speech, without relying on handcrafted intermediate features.
1 code implementation • 23 May 2018 • Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
To the best of our knowledge, this is the first method capable of generating subject-independent realistic videos directly from raw audio.