no code implementations • 24 Jan 2024 • Hai X. Pham, Isma Hadji, Xinnuo Xu, Ziedune Degutyte, Jay Rainey, Evangelos Kazakos, Afsaneh Fazly, Georgios Tzimiropoulos, Brais Martinez
The key technological enabler is a novel mechanism for automatic question-answer generation from procedural text, which can ingest large amounts of textual instructions and produce exhaustive in-domain QA training data.
1 code implementation • 4 Feb 2021 • Hai X. Pham, Ricardo Guerrero, Jiatong Li, Vladimir Pavlovic
Despite the abundance of multi-modal data, such as image-text pairs, there has been little effort in understanding the individual entities and their different roles in the construction of these data instances.
no code implementations • 21 Mar 2018 • Hai X. Pham, Yuting Wang, Vladimir Pavlovic
This paper presents Generative Adversarial Talking Head (GATH), a novel deep generative neural network that enables fully automatic facial expression synthesis of an arbitrary portrait with continuous action unit (AU) coefficients.
no code implementations • 2 Oct 2017 • Hai X. Pham, Yuting Wang, Vladimir Pavlovic
We present a deep learning framework for real-time speech-driven 3D facial animation from just raw waveforms.
no code implementations • 10 Jul 2015 • Hai X. Pham, Chongyu Chen, Luc N. Dao, Vladimir Pavlovic, Jianfei Cai, Tat-Jen Cham
We introduce a novel, robust hybrid 3D face tracking framework for RGBD video streams that tracks head pose and facial actions without pre-calibration or user intervention.