no code implementations • 26 Oct 2021 • Saurabh Sahu, Palash Goyal
In this paper, we propose a novel self-attention block that leverages both local and global temporal relationships between the video frames to obtain better contextualized representations for the individual frames.
no code implementations • 26 Oct 2021 • Divya Choudhary, Palash Goyal, Saurabh Sahu
To address this, several techniques have been proposed to increase the robustness of a model for image classification tasks.
no code implementations • 18 Mar 2021 • Saurabh Sahu, Palash Goyal
GAT uses a multi-level attention gate to model the relevance of a frame based on local and global contexts.
no code implementations • 7 Mar 2020 • Palash Goyal, Saurabh Sahu, Shalini Ghosh, Chul Lee
Multi-modal machine learning (ML) models can process data in multiple modalities (e.g., video, audio, text) and are useful for video content analysis in a variety of problems (e.g., object detection, scene understanding, activity recognition).
no code implementations • 7 Feb 2020 • Palash Goyal, Saurabh Sahu, Shalini Ghosh, Chul Lee
Multimodal ML models can process data in multiple modalities (e.g., video, images, audio, text) and are useful for video content analysis in a variety of problems (e.g., object detection, scene understanding).
no code implementations • 31 Oct 2019 • Saurabh Sahu, Rahul Gupta, Carol Espy-Wilson
In this work, we experiment with variants of GAN architectures to generate feature vectors corresponding to an emotion in two ways: (i) A generator is trained with samples from a mixture prior.
no code implementations • 18 Jun 2018 • Saurabh Sahu, Rahul Gupta, Carol Espy-Wilson
GANs consist of a discriminator and a generator working in tandem, playing a min-max game to learn a target underlying data distribution when fed with data points sampled from a simpler distribution (such as a uniform or Gaussian distribution).
no code implementations • 7 Jun 2018 • Rahul Gupta, Saurabh Sahu, Carol Espy-Wilson, Shrikanth Narayanan
Sentiment classification involves quantifying the affective reaction of a human to a document, media item, or event.
no code implementations • 6 Jun 2018 • Saurabh Sahu, Rahul Gupta, Ganesh Sivaraman, Wael Abd-Almageed, Carol Espy-Wilson
Recently, generative adversarial networks and adversarial autoencoders have gained a lot of attention in the machine learning community due to their exceptional performance in tasks such as digit classification and face recognition.