Search Results for author: Georgios Tzimiropoulos

Found 81 papers, 39 papers with code

VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning

2 code implementations • 10 Apr 2024 • Alexandros Xenos, Niki Maria Foteinopoulou, Ioanna Ntinou, Ioannis Patras, Georgios Tzimiropoulos

In the first stage, we propose prompting VLLMs to generate descriptions in natural language of the subject's apparent emotion relative to the visual context.

Ranked #1 on Emotion Recognition in Context on EMOTIC

Common Sense Reasoning Emotion Classification +1

213

DiffusionAct: Controllable Diffusion Autoencoder for One-shot Face Reenactment

no code implementations • 25 Mar 2024 • Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos

To this end, in this paper we present DiffusionAct, a novel method that leverages the photo-realistic image generation of diffusion models to perform neural face reenactment.

Face Reenactment Image Generation

LAFS: Landmark-based Facial Self-supervised Learning for Face Recognition

3 code implementations • 13 Mar 2024 • Zhonglin Sun, Chen Feng, Ioannis Patras, Georgios Tzimiropoulos

This enables our method, namely LAndmark-based Facial Self-supervised learning (LAFS), to learn key representations that are more critical for face recognition.

Face Recognition Self-Supervised Learning

202

One-shot Neural Face Reenactment via Finding Directions in GAN's Latent Space

no code implementations • 5 Feb 2024 • Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos

Moreover, we show that by embedding real images in the GAN latent space, our method can be successfully used for the reenactment of real-world faces.

Disentanglement Face Reenactment

You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation

no code implementations • 30 Jan 2024 • Mehdi Noroozi, Isma Hadji, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos

We show that the combination of spatially distilled U-Net and fine-tuned decoder outperforms state-of-the-art methods requiring 200 steps with only one single step.

Decoder Image Super-Resolution

Graph Guided Question Answer Generation for Procedural Question-Answering

no code implementations • 24 Jan 2024 • Hai X. Pham, Isma Hadji, Xinnuo Xu, Ziedune Degutyte, Jay Rainey, Evangelos Kazakos, Afsaneh Fazly, Georgios Tzimiropoulos, Brais Martinez

The key technological enabler is a novel mechanism for automatic question-answer generation from procedural text which can ingest large amounts of textual instructions and produce exhaustive in-domain QA training data.

Answer Generation Question-Answer-Generation +1

Multiscale Vision Transformers meet Bipartite Matching for efficient single-stage Action Localization

no code implementations • 29 Dec 2023 • Ioanna Ntinou, Enrique Sanchez, Georgios Tzimiropoulos

These methods build on adding a DETR head with learnable queries that, after cross- and self-attention, can be sent to corresponding MLPs for detecting a person's bounding box and action.

Action Localization

A Simple Baseline for Knowledge-Based Visual Question Answering

no code implementations • 20 Oct 2023 • Alexandros Xenos, Themos Stafylakis, Ioannis Patras, Georgios Tzimiropoulos

This paper is on the problem of Knowledge-Based Visual Question Answering (KB-VQA).

Ranked #5 on Visual Question Answering (VQA) on A-OKVQA (DA VQA Score metric)

In-Context Learning Question Answering +1

PRE: Vision-Language Prompt Learning with Reparameterization Encoder

1 code implementation • 14 Sep 2023 • Anh Pham Thi Minh, An Duc Nguyen, Georgios Tzimiropoulos

In this work, we present Prompt Learning with Reparameterization Encoder (PRE), a simple and efficient method that enhances the generalization ability of the learnable prompt to unseen classes while maintaining the capacity to learn base classes.

Ranked #1 on Few-Shot Image Classification on Caltech101

Few-Shot Image Classification Prompt Engineering

SimDETR: Simplifying self-supervised pretraining for DETR

no code implementations • 28 Jul 2023 • Ioannis Maniadis Metaxas, Adrian Bulat, Ioannis Patras, Brais Martinez, Georgios Tzimiropoulos

DETR-based object detectors have achieved remarkable performance but are sample-inefficient and exhibit slow convergence.

Few-Shot Object Detection Object +2

HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and Retarget Faces

1 code implementation • ICCV 2023 • Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos

In this paper, we present our method for neural face reenactment, called HyperReenact, that aims to generate realistic talking head images of a source identity, driven by a target facial pose.

Face Reenactment

Black Box Few-Shot Adaptation for Vision-Language models

1 code implementation • ICCV 2023 • Yassine Ouali, Adrian Bulat, Brais Martinez, Georgios Tzimiropoulos

Vision-Language (V-L) models trained with contrastive learning to align the visual and language modalities have been shown to be strong few-shot learners.

Contrastive Learning Re-Ranking

DivClust: Controlling Diversity in Deep Clustering

1 code implementation • CVPR 2023 • Ioannis Maniadis Metaxas, Georgios Tzimiropoulos, Ioannis Patras

Clustering has been a major research topic in the field of machine learning, one to which Deep Learning has recently been applied with significant success.

Clustering Deep Clustering

ReGen: A good Generative Zero-Shot Video Classifier Should be Rewarded

no code implementations • ICCV 2023 • Adrian Bulat, Enrique Sanchez, Brais Martinez, Georgios Tzimiropoulos

Specifically, we propose ReGen, a novel reinforcement learning based framework with a three-fold objective and reward functions: (1) a class-level discrimination reward that enforces the generated caption to be correctly classified into the corresponding action class, (2) a CLIP reward that encourages the generated caption to continue to be descriptive of the input video (i.e., video-specific), and (3) a grammar reward that preserves the grammatical correctness of the caption.

Action Classification Action Recognition +4

Part-based Face Recognition with Vision Transformers

1 code implementation • 30 Nov 2022 • Zhonglin Sun, Georgios Tzimiropoulos

Holistic methods using CNNs and margin-based losses have dominated research on face recognition.

Face Recognition

FS-DETR: Few-Shot DEtection TRansformer with prompting and without re-training

no code implementations • ICCV 2023 • Adrian Bulat, Ricardo Guerrero, Brais Martinez, Georgios Tzimiropoulos

Importantly, we show that our system is not only more flexible than existing methods, but also, it makes a step towards satisfying desideratum (c).

Few-Shot Object Detection object-detection +1

Bayesian Prompt Learning for Image-Language Model Generalization

1 code implementation • ICCV 2023 • Mohammad Mahdi Derakhshani, Enrique Sanchez, Adrian Bulat, Victor Guilherme Turrisi da Costa, Cees G. M. Snoek, Georgios Tzimiropoulos, Brais Martinez

Our approach regularizes the prompt space, reduces overfitting to the seen prompts and improves the prompt generalization on unseen prompts.

Ranked #1 on Few-Shot Learning on food101

Few-Shot Learning Language Modelling +3

LASP: Text-to-Text Optimization for Language-Aware Soft Prompting of Vision & Language Models

1 code implementation • CVPR 2023 • Adrian Bulat, Georgios Tzimiropoulos

Through evaluations on 11 datasets, we show that our approach (a) significantly outperforms all prior works on soft prompting, and (b) matches and surpasses, for the first time, the accuracy on novel classes obtained by hand-crafted prompts and CLIP for 8 out of 11 test datasets.

Few-Shot Learning Language Modelling +3

REST: REtrieve & Self-Train for generative action recognition

no code implementations • 29 Sep 2022 • Adrian Bulat, Enrique Sanchez, Brais Martinez, Georgios Tzimiropoulos

We evaluate REST on the problem of zero-shot action recognition where we show that our approach is very competitive when compared to contrastive learning-based methods.

Action Recognition Caption Generation +5

StyleMask: Disentangling the Style Space of StyleGAN2 for Neural Face Reenactment

1 code implementation • 27 Sep 2022 • Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos

In this paper we address the problem of neural face reenactment, where, given a pair of a source and a target facial image, we need to transfer the target's pose (defined as the head pose and its facial expressions) to the source image, while preserving the source's identity characteristics (e.g., facial shape, hair style, etc.), even in the challenging case where the source and the target faces belong to different identities.

Disentanglement Face Reenactment

108

Efficient Attention-free Video Shift Transformers

no code implementations • 23 Aug 2022 • Adrian Bulat, Brais Martinez, Georgios Tzimiropoulos

To address this gap, in this paper, we make the following contributions: (a) we construct a highly efficient and accurate attention-free block based on the shift operator, coined Affine-Shift block, specifically designed to approximate as closely as possible the operations in the MHSA block of a Transformer layer.

Action Recognition Video Recognition

iBoot: Image-bootstrapped Self-Supervised Video Representation Learning

no code implementations • 16 Jun 2022 • Fatemeh Saleh, Fuwen Tan, Adrian Bulat, Georgios Tzimiropoulos, Brais Martinez

Video self-supervised learning (SSL) suffers from added challenges: video datasets are typically not as large as image datasets, compute is an order of magnitude larger, and the amount of spurious patterns the optimizer has to sieve through is multiplied several fold.

Data Augmentation Representation Learning +1

ContraCLIP: Interpretable GAN generation driven by pairs of contrasting sentences

1 code implementation • 5 Jun 2022 • Christos Tzelepis, James Oldfield, Georgios Tzimiropoulos, Ioannis Patras

This work addresses the problem of discovering non-linear interpretable paths in the latent space of pre-trained GANs in a model-agnostic manner.

Position

From Keypoints to Object Landmarks via Self-Training Correspondence: A novel approach to Unsupervised Landmark Discovery

2 code implementations • 31 May 2022 • Dimitrios Mallis, Enrique Sanchez, Matt Bell, Georgios Tzimiropoulos

This paper proposes a novel paradigm for the unsupervised learning of object landmark detectors.

Contrastive Learning Image Generation

Knowledge Distillation Meets Open-Set Semi-Supervised Learning

1 code implementation • 13 May 2022 • Jing Yang, Xiatian Zhu, Adrian Bulat, Brais Martinez, Georgios Tzimiropoulos

The key idea is that we leverage the teacher's classifier as a semantic critic for evaluating the representations of both teacher and student and distilling the semantic knowledge with high-order structured information over all feature dimensions.

Face Recognition Knowledge Distillation

EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers

1 code implementation • 6 May 2022 • Junting Pan, Adrian Bulat, Fuwen Tan, Xiatian Zhu, Lukasz Dudziak, Hongsheng Li, Georgios Tzimiropoulos, Brais Martinez

In this work, pushing further along this under-studied direction, we introduce EdgeViTs, a new family of light-weight ViTs that, for the first time, enable attention-based vision models to compete with the best light-weight CNNs in the tradeoff between accuracy and on-device efficiency.

Finding Directions in GAN's Latent Space for Neural Face Reenactment

1 code implementation • 31 Jan 2022 • Stella Bounareli, Vasileios Argyriou, Georgios Tzimiropoulos

Moreover, we show that by embedding real images in the GAN latent space, our method can be successfully used for the reenactment of real-world faces.

Disentanglement Face Reenactment

SSR: An Efficient and Robust Framework for Learning with Unknown Label Noise

1 code implementation • 22 Nov 2021 • Chen Feng, Georgios Tzimiropoulos, Ioannis Patras

Under this setting, unlike previous methods that often introduce multiple assumptions and lead to complex solutions, we propose a simple, efficient and robust framework named Sample Selection and Relabelling (SSR), which achieves SOTA results in various conditions with a minimal number of hyperparameters.

Ranked #1 on Image Classification on CIFAR-10 (with noisy labels)

Learning with noisy labels Self-Supervised Learning +1

Subpixel Heatmap Regression for Facial Landmark Localization

no code implementations • 3 Nov 2021 • Adrian Bulat, Enrique Sanchez, Georgios Tzimiropoulos

Deep Learning models based on heatmap regression have revolutionized the task of facial landmark localization with existing models working robustly under large poses, non-uniform illumination and shadows, occlusions and self-occlusions, low resolution and blur.

Ranked #1 on Face Alignment on WFW (Extra Data) (using extra training data)

Face Alignment regression

Defensive Tensorization

no code implementations • 26 Oct 2021 • Adrian Bulat, Jean Kossaifi, Sourav Bhattacharya, Yannis Panagakis, Timothy Hospedales, Georgios Tzimiropoulos, Nicholas D Lane, Maja Pantic

We propose defensive tensorization, an adversarial defence technique that leverages a latent high-order factorization of the network.

Audio Classification Image Classification

SAIC_Cambridge-HuPBA-FBK Submission to the EPIC-Kitchens-100 Action Recognition Challenge 2021

no code implementations • 6 Oct 2021 • Swathikiran Sudhakaran, Adrian Bulat, Juan-Manuel Perez-Rua, Alex Falcon, Sergio Escalera, Oswald Lanz, Brais Martinez, Georgios Tzimiropoulos

This report presents the technical details of our submission to the EPIC-Kitchens-100 Action Recognition Challenge 2021.

Action Recognition Temporal Action Localization

WarpedGANSpace: Finding non-linear RBF paths in GAN latent space

1 code implementation • ICCV 2021 • Christos Tzelepis, Georgios Tzimiropoulos, Ioannis Patras

This work addresses the problem of discovering, in an unsupervised manner, interpretable paths in the latent space of pretrained GANs, so as to provide an intuitive and easy way of controlling the underlying generative factors.

108

Space-time Mixing Attention for Video Transformer

1 code implementation • NeurIPS 2021 • Adrian Bulat, Juan-Manuel Perez-Rua, Swathikiran Sudhakaran, Brais Martinez, Georgios Tzimiropoulos

In this work, we propose a Video Transformer model the complexity of which scales linearly with the number of frames in the video sequence and hence induces no overhead compared to an image-based Transformer model.

Ranked #32 on Action Classification on Kinetics-600

Action Classification Action Recognition In Videos +1

Bit-Mixer: Mixed-precision networks with runtime bit-width selection

no code implementations • ICCV 2021 • Adrian Bulat, Georgios Tzimiropoulos

In this work, we propose Bit-Mixer, the very first method to train a meta-quantized network where during test time any layer can change its bit-width without affecting at all the overall network's ability for highly accurate inference.

AutoML Binarization +1

Pre-training strategies and datasets for facial representation learning

2 code implementations • 30 Mar 2021 • Adrian Bulat, Shiyang Cheng, Jing Yang, Andrew Garbett, Enrique Sanchez, Georgios Tzimiropoulos

Recent work on Deep Learning in the area of face analysis has focused on supervised learning for specific tasks of interest (e.g., face recognition, facial landmark localization, etc.).

Ranked #1 on Facial Expression Recognition (FER) on BP4D

3D Face Reconstruction 3D Facial Landmark Localization +11

420

Affective Processes: stochastic modelling of temporal context for emotion and facial expression recognition

no code implementations • CVPR 2021 • Enrique Sanchez, Mani Kumar Tellamekala, Michel Valstar, Georgios Tzimiropoulos

Temporal context is key to the recognition of expressions of emotion.

Arousal Estimation Emotion Recognition +2

Improving memory banks for unsupervised learning with large mini-batch, consistency and hard negative mining

no code implementations • 8 Feb 2021 • Adrian Bulat, Enrique Sánchez-Lozano, Georgios Tzimiropoulos

An important component of unsupervised learning by instance-based discrimination is a memory bank for storing a feature representation for each training sample in the dataset.

Knowledge distillation via softmax regression representation learning

no code implementations • ICLR 2021 • Jing Yang, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos

We advocate for a method that optimizes the output feature of the penultimate layer of the student network and hence is directly related to representation learning.

Knowledge Distillation Model Compression +2

Unsupervised Learning of Object Landmarks via Self-Training Correspondence

1 code implementation • NeurIPS 2020 • Dimitrios Mallis, Enrique Sanchez, Matthew Bell, Georgios Tzimiropoulos

This paper addresses the problem of unsupervised discovery of object landmarks.

Clustering Object +1

Semi-supervised Facial Action Unit Intensity Estimation with Contrastive Learning

no code implementations • 3 Nov 2020 • Enrique Sanchez, Adrian Bulat, Anestis Zaganidis, Georgios Tzimiropoulos

The second stage uses another dataset of randomly chosen labeled frames to train a regressor on top of our spatio-temporal model for estimating the AU intensity.

Contrastive Learning Unsupervised Pre-training

High-Capacity Expert Binary Networks

1 code implementation • ICLR 2021 • Adrian Bulat, Brais Martinez, Georgios Tzimiropoulos

Network binarization is a promising hardware-aware direction for creating efficient deep models.

Ranked #2 on Classification with Binary Neural Network on ImageNet

Binarization Classification with Binary Neural Network +1

A Transfer Learning approach to Heatmap Regression for Action Unit intensity estimation

no code implementations • 14 Apr 2020 • Ioanna Ntinou, Enrique Sanchez, Adrian Bulat, Michel Valstar, Georgios Tzimiropoulos

Action Units (AUs) are geometrically-based atomic facial muscle movements known to produce appearance changes at specific facial locations.

Face Alignment regression +1

Training Binary Neural Networks with Real-to-Binary Convolutions

1 code implementation • ICLR 2020 • Brais Martinez, Jing Yang, Adrian Bulat, Georgios Tzimiropoulos

This paper shows how to train binary networks to within a few percent points (~3-5%) of the full precision counterpart.

Ranked #2 on Classification with Binary Neural Network on CIFAR-100

Binarization Classification with Binary Neural Network

Knowledge distillation via adaptive instance normalization

no code implementations • 9 Mar 2020 • Jing Yang, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos

To this end, we propose a new knowledge distillation method based on transferring feature statistics, specifically the channel-wise mean and variance, from the teacher to the student.

Knowledge Distillation Model Compression

BATS: Binary ArchitecTure Search

no code implementations • ECCV 2020 • Adrian Bulat, Brais Martinez, Georgios Tzimiropoulos

We show that directly applying NAS to the binary domain provides very poor results.

Ranked #1 on Classification with Binary Neural Network on CIFAR-10 (using extra training data)

Binarization Classification with Binary Neural Network +1

Toward fast and accurate human pose estimation via soft-gated skip connections

3 code implementations • 25 Feb 2020 • Adrian Bulat, Jean Kossaifi, Georgios Tzimiropoulos, Maja Pantic

In addition, with a reduction of 3x in model size and complexity, we show no decrease in performance when compared to the original HourGlass network.

Ranked #2 on Pose Estimation on Leeds Sports Poses

Pose Estimation

Towards Pose-invariant Lip-Reading

no code implementations • 14 Nov 2019 • Shiyang Cheng, Pingchuan Ma, Georgios Tzimiropoulos, Stavros Petridis, Adrian Bulat, Jie Shen, Maja Pantic

The proposed model significantly outperforms previous approaches on non-frontal views while retaining the superior performance on frontal and near frontal mouth views.

Lip Reading

Object landmark discovery through unsupervised adaptation

1 code implementation • NeurIPS 2019 • Enrique Sanchez, Georgios Tzimiropoulos

Contrary to previous works, we do however assume that a landmark detector, which has already learned a structured representation for a given object category in a fully supervised manner, is available.

Object Unsupervised Landmark Detection

XNOR-Net++: Improved Binary Neural Networks

1 code implementation • 30 Sep 2019 • Adrian Bulat, Georgios Tzimiropoulos

This paper proposes an improved training algorithm for binary neural networks in which both weights and activations are binary numbers.

Ranked #7 on Classification with Binary Neural Network on ImageNet

Binarization Classification with Binary Neural Network +3

128

Defensive Tensorization: Randomized Tensor Parametrization for Robust Neural Networks

no code implementations • 25 Sep 2019 • Adrian Bulat, Jean Kossaifi, Sourav Bhattacharya, Yannis Panagakis, Georgios Tzimiropoulos, Nicholas D. Lane, Maja Pantic

As deep neural networks become widely adopted for solving most problems in computer vision and audio-understanding, there are rising concerns about their potential vulnerability.

Adversarial Defense Audio Classification +1

AnimalWeb: A Large-Scale Hierarchical Dataset of Annotated Animal Faces

1 code implementation • CVPR 2020 • Muhammad Haris Khan, John McDonagh, Salman Khan, Muhammad Shahabuddin, Aditya Arora, Fahad Shahbaz Khan, Ling Shao, Georgios Tzimiropoulos

Several studies show that animal needs are often expressed through their faces.

Face Alignment Face Detection

Matrix and tensor decompositions for training binary neural networks

no code implementations • 16 Apr 2019 • Adrian Bulat, Jean Kossaifi, Georgios Tzimiropoulos, Maja Pantic

This paper is on improving the training of binary neural networks in which both activations and weights are binary.

Ranked #8 on Classification with Binary Neural Network on ImageNet

Binarization Classification with Binary Neural Network +4

Incremental multi-domain learning with network latent tensor factorization

no code implementations • 12 Apr 2019 • Adrian Bulat, Jean Kossaifi, Georgios Tzimiropoulos, Maja Pantic

Adapting the learned classification to new domains is a hard problem for at least three reasons: (1) the new domains and the tasks might be drastically different; (2) there might be a very limited amount of annotated data on the new domain; and (3) full training of a new model for each new task is prohibitive in terms of computation and memory, due to the sheer number of parameters of deep CNNs.

General Classification Image Classification +2

Improved training of binary networks for human pose estimation and image recognition

1 code implementation • 11 Apr 2019 • Adrian Bulat, Georgios Tzimiropoulos, Jean Kossaifi, Maja Pantic

Big neural networks trained on large datasets have advanced the state-of-the-art for a large variety of challenging problems, improving performance by a large margin.

Ranked #9 on Classification with Binary Neural Network on ImageNet

Binarization Classification with Binary Neural Network +4

128

T-Net: Parametrizing Fully Convolutional Nets with a Single High-Order Tensor

no code implementations • CVPR 2019 • Jean Kossaifi, Adrian Bulat, Georgios Tzimiropoulos, Maja Pantic

In this paper, we propose to fully parametrize Convolutional Neural Networks (CNNs) with a single high-order, low-rank tensor.

Ranked #35 on Pose Estimation on MPII Human Pose

Pose Estimation

Features Extraction Based on an Origami Representation of 3D Landmarks

no code implementations • 12 Dec 2018 • Juan Manuel Fernandez Montenegro, Mahdi Maktab Dar Oghaz, Athanasios Gkelias, Georgios Tzimiropoulos, Vasileios Argyriou

The performance evaluation demonstrates an improvement on facial emotion classification (accuracy and F1 score) that indicates the superiority of the proposed methodology.

Classification Emotion Classification +1

Learning to Infer the Depth Map of a Hand from its Color Image

no code implementations • 6 Dec 2018 • Vassilis C. Nicodemou, Iason Oikonomidis, Georgios Tzimiropoulos, Antonis Argyros

We propose the first approach to the problem of inferring the depth map of a human hand based on a single RGB image.

3D Reconstruction Depth Estimation +1

Pushing the boundaries of audiovisual word recognition using Residual Networks and LSTMs

no code implementations • 3 Nov 2018 • Themos Stafylakis, Muhammad Haris Khan, Georgios Tzimiropoulos

A further analysis on the utility of target word boundaries is provided, as well as on the capacity of the network in modeling the linguistic context of the target word.

Lipreading speech-recognition +1

Audio-Visual Speech Recognition With A Hybrid CTC/Attention Architecture

no code implementations • 28 Sep 2018 • Stavros Petridis, Themos Stafylakis, Pingchuan Ma, Georgios Tzimiropoulos, Maja Pantic

Therefore, we could use a CTC loss in combination with an attention-based model in order to force monotonic alignments and at the same time get rid of the conditional independence assumption.

Ranked #5 on Audio-Visual Speech Recognition on LRS2

Audio-Visual Speech Recognition Automatic Speech Recognition (ASR) +3

3D Human Body Reconstruction from a Single Image via Volumetric Regression

no code implementations • 11 Sep 2018 • Aaron S. Jackson, Chris Manafas, Georgios Tzimiropoulos

This paper proposes the use of an end-to-end Convolutional Neural Network for direct reconstruction of the 3D geometry of humans via volumetric regression.

regression

Hierarchical binary CNNs for landmark localization with limited resources

1 code implementation • 14 Aug 2018 • Adrian Bulat, Georgios Tzimiropoulos

To this end, we make the following contributions: (a) we are the first to study the effect of neural network binarization on localization tasks, namely human pose estimation and face alignment.

Ranked #1 on 3D Face Alignment on AFLW2000-3D

3D Face Alignment Binarization +2

128

To learn image super-resolution, use a GAN to learn how to do image degradation first

2 code implementations • ECCV 2018 • Adrian Bulat, Jing Yang, Georgios Tzimiropoulos

This paper is on image and face super-resolution.

Generative Adversarial Network Image Super-Resolution

214

Zero-shot keyword spotting for visual speech recognition in-the-wild

1 code implementation • ECCV 2018 • Themos Stafylakis, Georgios Tzimiropoulos

Visual keyword spotting (KWS) is the problem of estimating whether a text query occurs in a given recording using only video information.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Joint Action Unit localisation and intensity estimation through heatmap regression

1 code implementation • 9 May 2018 • Enrique Sanchez-Lozano, Georgios Tzimiropoulos, Michel Valstar

Contrary to previous works that try to learn an unsupervised representation of the Action Unit regions, we propose to directly and jointly estimate all AU intensities through heatmap regression, along with the location in the face where they cause visible changes.

regression

End-to-end Audiovisual Speech Recognition

2 code implementations • IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2018 • Stavros Petridis, Themos Stafylakis, Pingchuan Ma, Feipeng Cai, Georgios Tzimiropoulos, Maja Pantic

In presence of high levels of noise, the end-to-end audiovisual model significantly outperforms both audio-only models.

Ranked #18 on Lipreading on Lip Reading in the Wild

Lipreading speech-recognition +1

174

Super-FAN: Integrated facial landmark localization and super-resolution of real-world low resolution faces in arbitrary poses with GANs

no code implementations • CVPR 2018 • Adrian Bulat, Georgios Tzimiropoulos

This paper addresses 2 challenging tasks: improving the quality of low resolution facial images and accurately locating the facial landmarks on such poor resolution images.

Ranked #4 on Face Hallucination on FFHQ 512 x 512 - 16x upscaling

Face Alignment Face Hallucination +1

Deep word embeddings for visual speech recognition

1 code implementation • 30 Oct 2017 • Themos Stafylakis, Georgios Tzimiropoulos

In this paper we present a deep learning architecture for extracting word embeddings for visual speech recognition.

Lipreading speech-recognition +2

Synergy Between Face Alignment and Tracking via Discriminative Global Consensus Optimization

no code implementations • ICCV 2017 • Muhammad Haris Khan, John McDonagh, Georgios Tzimiropoulos

Tracking-by-detection is drift-free but results in low accuracy fittings.

Face Alignment Open-Ended Question Answering

Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression

1 code implementation • ICCV 2017 • Aaron S. Jackson, Adrian Bulat, Vasileios Argyriou, Georgios Tzimiropoulos

Our CNN works with just a single 2D facial image, neither requires accurate alignment nor establishes dense correspondence between images, works for arbitrary facial poses and expressions, and can be used to reconstruct the whole 3D facial geometry (including the non-visible parts of the face), bypassing the construction (during training) and fitting (during testing) of a 3D Morphable Model.

Ranked #2 on 3D Face Reconstruction on Florence

3D Face Reconstruction Face Alignment +1

4,518

How far are we from solving the 2D & 3D Face Alignment problem? (and a dataset of 230,000 3D facial landmarks)

8 code implementations • ICCV 2017 • Adrian Bulat, Georgios Tzimiropoulos

To this end, we make the following 5 contributions: (a) we construct, for the first time, a very strong baseline by combining a state-of-the-art architecture for landmark localization with a state-of-the-art residual block, train it on a very large yet synthetically expanded 2D facial landmark dataset and finally evaluate it on all other 2D facial landmark datasets.

Ranked #1 on Face Alignment on LS3D-W Balanced

3D Face Alignment Face Alignment +1

6,822

Combining Residual Networks with LSTMs for Lipreading

4 code implementations • 12 Mar 2017 • Themos Stafylakis, Georgios Tzimiropoulos

We propose an end-to-end deep learning architecture for word-level visual speech recognition.

Ranked #20 on Lipreading on Lip Reading in the Wild

Lipreading Lip Reading +2

Binarized Convolutional Landmark Localizers for Human Pose Estimation and Face Alignment with Limited Resources

3 code implementations • ICCV 2017 • Adrian Bulat, Georgios Tzimiropoulos

(d) We present results for experiments on the most challenging datasets for human pose estimation and face alignment, reporting in many cases state-of-the-art performance.

Ranked #1 on Face Alignment on AFLW-Full

Binarization Face Alignment +1

211

ChaLearn Looking at People and Faces of the World: Face Analysis Workshop and Challenge 2016

no code implementations • 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2016 • Sergio Escalera, Mercedes Torres Torres, Brais Martínez, Xavier Baró, Hugo Jair Escalante, Isabelle Guyon, Georgios Tzimiropoulos, Ciprian Corneanu, Marc Oliu, Mohammad Ali Bagheri, Michel Valstar

A custom-built application was used to collect and label data about the apparent age of people (as opposed to the real age).

Ranked #2 on Gender Prediction on FotW Gender

Age Estimation Gender Classification +2

A Functional Regression approach to Facial Landmark Tracking

no code implementations • 7 Dec 2016 • Enrique Sánchez-Lozano, Georgios Tzimiropoulos, Brais Martinez, Fernando de la Torre, Michel Valstar

This paper presents a Functional Regression solution to the least squares problem, which we coin Continuous Regression, resulting in the first real-time incremental face tracker.

Face Detection Incremental Learning +2

A CNN Cascade for Landmark Guided Semantic Part Segmentation

no code implementations • 30 Sep 2016 • Aaron Jackson, Michel Valstar, Georgios Tzimiropoulos

This paper proposes a CNN cascade for semantic part segmentation guided by pose-specific information encoded in terms of a set of landmarks (or keypoints).

Pose Estimation Segmentation

Two-stage Convolutional Part Heatmap Regression for the 1st 3D Face Alignment in the Wild (3DFAW) Challenge

1 code implementation • 29 Sep 2016 • Adrian Bulat, Georgios Tzimiropoulos

This paper describes our submission to the 1st 3D Face Alignment in the Wild (3DFAW) Challenge.

Ranked #1 on Face Alignment on 3DFAW

3D Face Alignment Depth Estimation +2

6,822

Convolutional aggregation of local evidence for large pose face alignment

no code implementations • British Machine Vision Conference 2016 • Adrian Bulat, Georgios Tzimiropoulos

Besides playing the role of a graphical model, CNN regression is a key feature of our system, guiding the network to rely on context for predicting the location of occluded landmarks, typically encountered in very large poses.

Ranked #1 on Face Alignment on AFLW-PIFA (21 points)

Face Alignment Face Detection +1

Human pose estimation via Convolutional Part Heatmap Regression

1 code implementation • 6 Sep 2016 • Adrian Bulat, Georgios Tzimiropoulos

Our main contribution is a CNN cascaded architecture specifically designed for learning part relationships and spatial context, and robustly inferring pose even for the case of severe part occlusions.

Ranked #10 on Pose Estimation on Leeds Sports Poses

Pose Estimation regression

107

Cascaded Continuous Regression for Real-time Incremental Face Tracking

no code implementations • 3 Aug 2016 • Enrique Sánchez-Lozano, Brais Martinez, Georgios Tzimiropoulos, Michel Valstar

We then derive the incremental learning updates for CCR (iCCR) and show that it is an order of magnitude faster than standard incremental learning for cascaded regression, bringing the time required for the update from seconds down to a fraction of a second, thus enabling real-time tracking.

Face Alignment Incremental Learning +2

Project-Out Cascaded Regression With an Application to Face Alignment

no code implementations • CVPR 2015 • Georgios Tzimiropoulos

Cascaded regression approaches have been recently shown to achieve state-of-the-art performance for many computer vision tasks.

Face Alignment regression

Gauss-Newton Deformable Part Models for Face Alignment in-the-Wild

no code implementations • CVPR 2014 • Georgios Tzimiropoulos, Maja Pantic

To address this limitation, in this paper, we propose to jointly optimize a part-based, trained in-the-wild, flexible appearance model along with a global shape model which results in a joint translational motion model for the model parts via Gauss-Newton (GN) optimization.

Face Alignment
