no code implementations • 16 May 2024 • Charles Raude, K R Prajwal, Liliane Momeni, Hannah Bull, Samuel Albanie, Andrew Zisserman, Gül Varol
To this end, we introduce a multi-task Transformer model, CSLR2, that ingests a signing sequence and outputs embeddings in a joint space shared between signed language and spoken language text.
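For intuition, here is a minimal sketch of how such a joint video-text embedding space is commonly trained with a symmetric contrastive (InfoNCE/CLIP-style) loss. The encoders producing the embeddings, the function name, and the temperature value are illustrative assumptions, not CSLR2's actual implementation.

```python
import torch
import torch.nn.functional as F

def contrastive_joint_embedding_loss(video_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss pulling matched video/text pairs together.

    video_emb, text_emb: (batch, dim) outputs of two hypothetical encoders.
    """
    v = F.normalize(video_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = v @ t.T / temperature          # (batch, batch) similarity matrix
    targets = torch.arange(v.size(0), device=v.device)
    # Matched pairs lie on the diagonal; contrast in both directions.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2
```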
1 code implementation • 16 Nov 2022 • K R Prajwal, Hannah Bull, Liliane Momeni, Samuel Albanie, Gül Varol, Andrew Zisserman
Through extensive evaluations, we verify the effectiveness of both our automatic annotation method and our model architecture.
no code implementations • 1 Sep 2022 • Sindhu B Hegde, K R Prajwal, Rudrabha Mukhopadhyay, Vinay P Namboodiri, C. V. Jawahar
With the help of multiple powerful discriminators that guide the training process, our generator learns to synthesize speech sequences in any voice for the lip movements of any person.
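As a rough illustration of training a generator against several discriminators, the sketch below sums an adversarial loss from each critic into a single generator objective. All module names, the critic roles, and the loss form are hypothetical placeholders, not the paper's code.

```python
import torch
import torch.nn.functional as F

def generator_step(generator, discriminators, lip_frames, optimizer):
    """One generator update guided by multiple discriminators."""
    optimizer.zero_grad()
    fake_speech = generator(lip_frames)        # e.g. a predicted mel-spectrogram
    loss = 0.0
    for disc in discriminators:                # e.g. sync / quality / identity critics
        score = disc(fake_speech, lip_frames)  # assumed to output a probability in (0, 1)
        # The generator is rewarded when each critic judges its output real.
        loss = loss + F.binary_cross_entropy(score, torch.ones_like(score))
    loss.backward()
    optimizer.step()
    return loss.item()
```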
no code implementations • 4 Aug 2022 • Liliane Momeni, Hannah Bull, K R Prajwal, Samuel Albanie, Gül Varol, Andrew Zisserman
Recently, sign language researchers have turned to sign language interpreted TV broadcasts, comprising (i) a video of continuous signing and (ii) subtitles corresponding to the audio content, as a readily available and large-scale source of training data.
1 code implementation • 29 Oct 2021 • K R Prajwal, Liliane Momeni, Triantafyllos Afouras, Andrew Zisserman
In this paper, we consider the task of spotting spoken keywords in silent video sequences -- also known as visual keyword spotting.
Ranked #1 on Visual Keyword Spotting on LRS2
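To make the task concrete, here is a toy sketch of keyword localization: score a keyword embedding against per-frame visual speech features and threshold the peak similarity. The encoders producing these tensors are assumed, and this is a generic sketch rather than the paper's method.

```python
import torch
import torch.nn.functional as F

def spot_keyword(video_features, keyword_embedding, threshold=0.5):
    """Return (is_present, frame_index) for a keyword in a silent video.

    video_features: (num_frames, dim) per-frame visual speech features.
    keyword_embedding: (dim,) embedding of the query keyword.
    """
    sims = F.cosine_similarity(                    # (num_frames,)
        video_features, keyword_embedding.unsqueeze(0), dim=-1)
    probs = torch.sigmoid(sims)                    # squash to a (0, 1) score
    best = int(probs.argmax())                     # most likely occurrence
    return probs[best].item() > threshold, best
```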
no code implementations • CVPR 2022 • K R Prajwal, Triantafyllos Afouras, Andrew Zisserman
To this end, we make the following contributions: (1) we propose an attention-based pooling mechanism to aggregate visual speech representations; (2) we use sub-word units for lip reading for the first time and show that this allows us to better model the ambiguities of the task; (3) we propose a model for Visual Speech Detection (VSD), trained on top of the lip reading network.
Ranked #1 on Visual Speech Recognition on LRS2 (using extra training data)
Tasks: Audio-Visual Active Speaker Detection, Automatic Speech Recognition, +5
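The attention-based pooling of contribution (1) can be sketched in a few lines: learn a scalar relevance score per frame and take a softmax-weighted average of the frame features. This is a minimal sketch of the general mechanism, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    """Pool a sequence of visual speech features (frames, dim) into one vector."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)   # one scalar relevance score per frame

    def forward(self, feats):            # feats: (batch, frames, dim)
        weights = torch.softmax(self.score(feats), dim=1)  # (batch, frames, 1)
        return (weights * feats).sum(dim=1)                # weighted average
```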
1 code implementation • 20 Dec 2020 • Sindhu B Hegde, K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar
In this work, we re-think the task of speech enhancement in unconstrained real-world environments.
Ranked #1 on Speech Denoising on LRS3+VGGSound
4 code implementations • 23 Aug 2020 • K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar
However, they fail to accurately morph the lip movements of arbitrary identities in dynamic, unconstrained talking face videos, resulting in significant parts of the video being out-of-sync with the new audio.
Ranked #1 on Unconstrained Lip-synchronization on LRS3 (using extra training data)
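Wav2Lip-style systems penalize out-of-sync generations using a pre-trained lip-sync expert that embeds the audio and video windows and compares them. The sketch below shows such a cosine-similarity sync penalty; the encoder architectures and the exact loss scaling are assumptions here.

```python
import torch
import torch.nn.functional as F

def sync_penalty(video_emb, audio_emb, eps=1e-8):
    """Penalize generated frames whose embedding drifts from the audio's.

    video_emb, audio_emb: (batch, dim) outputs of assumed sync-expert encoders.
    """
    sim = F.cosine_similarity(video_emb, audio_emb, dim=-1)  # (batch,)
    prob = sim.clamp(eps, 1.0)        # treat similarity as a sync probability
    return -torch.log(prob).mean()    # low similarity -> large penalty
```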
1 code implementation • CVPR 2020 • K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar
In this work, we explore the task of lip to speech synthesis, i.e., learning to generate natural speech given only the lip movements of a speaker.
Ranked #1 on Lip Reading on LRW
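A lip-to-speech system of this kind can be pictured as a visual encoder over the lip-frame sequence followed by a decoder that predicts a mel-spectrogram, which a vocoder would then turn into audio. The sketch below is illustrative only; the 3D-convolution/GRU choice and all layer sizes are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class LipToSpeech(nn.Module):
    """Encode lip frames, then decode a mel-spectrogram frame by frame."""
    def __init__(self, feat_dim=512, n_mels=80):
        super().__init__()
        self.frontend = nn.Conv3d(3, feat_dim, kernel_size=(5, 7, 7),
                                  stride=(1, 2, 2), padding=(2, 3, 3))
        self.encoder = nn.GRU(feat_dim, feat_dim, batch_first=True)
        self.to_mel = nn.Linear(feat_dim, n_mels)

    def forward(self, frames):                    # frames: (B, 3, T, H, W)
        x = self.frontend(frames)                 # (B, feat_dim, T, H', W')
        x = x.mean(dim=(-1, -2)).transpose(1, 2)  # spatial pool -> (B, T, feat_dim)
        x, _ = self.encoder(x)                    # temporal modeling
        return self.to_mel(x)                     # (B, T, n_mels) mel-spectrogram
```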