1 code implementation • 28 Mar 2024 • Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Pu Wang, Minwoo Lee, Srijan Das, Chen Chen
To address these challenges, we propose Bidirectional Autoregressive Motion Model (BAMM), a novel text-to-motion generation framework.
no code implementations • 19 Jan 2024 • Arijit Das, Somashree Nandy, Rupam Saha, Srijan Das, Diganta Saha
In this work, the proposed method uses a transformer-based model to detect hate speech on social media platforms such as Twitter, Facebook, WhatsApp, and Instagram.
1 code implementation • 22 Dec 2023 • Saarthak Kapse, Pushpak Pati, Srijan Das, Jingwei Zhang, Chao Chen, Maria Vakalopoulou, Joel Saltz, Dimitris Samaras, Rajarsi R. Gupta, Prateek Prasanna
Introducing interpretability and reasoning into Multiple Instance Learning (MIL) methods for Whole Slide Image (WSI) analysis is challenging, given the complexity of gigapixel slides.
no code implementations • 7 Dec 2023 • Aritra Dutta, Srijan Das, Jacob Nielsen, Rajatsubhra Chakraborty, Mubarak Shah
Despite the commercial abundance of UAVs, aerial data acquisition remains challenging, and the existing Asia and North America-centric open-source UAV datasets are small-scale or low-resolution and lack diversity in scene contextuality.
1 code implementation • 30 Nov 2023 • Dominick Reilly, Srijan Das
To facilitate the adoption of video transformers for ADL, we hypothesize that the augmentation of RGB with human pose information, known for its sensitivity to fine-grained motion and multiple viewpoints, is essential.
Ranked #1 on Action Classification on Toyota Smarthome dataset (using extra training data)
1 code implementation • 31 Oct 2023 • Srijan Das, Tanmay Jain, Dominick Reilly, Pranav Balaji, Soumyajit Karmakar, Shyam Marjit, Xiang Li, Abhijit Das, Michael S. Ryoo
We explore the appropriate SSL tasks that can be optimized alongside the primary task, the training schemes for these tasks, and the data scale at which they can be most effective.
no code implementations • 12 Sep 2023 • Saarthak Kapse, Srijan Das, Jingwei Zhang, Rajarsi R. Gupta, Joel Saltz, Dimitris Samaras, Prateek Prasanna
We propose DiRL, a Diversity-inducing Representation Learning technique for histopathology imaging.
no code implementations • 25 Aug 2023 • Pranav Balaji, Abhijit Das, Srijan Das, Antitza Dantcheva
This work explores various multi-task learning (MTL) techniques for classifying videos as original or manipulated in a cross-manipulation scenario, with the aim of attaining generalizability in deepfake detection.
1 code implementation • 15 Jun 2023 • Dominick Reilly, Aman Chadha, Srijan Das
Both PAAT and PAAB surpass their respective backbone Transformers by up to 9.8% in real-world action recognition and 21.8% in multi-view robotic video alignment.
1 code implementation • 1 Jul 2022 • Srijan Das, Michael S. Ryoo
The CLIP embedding provides a fine-grained understanding of objects relevant to an action, whereas the SlowFast network is responsible for modeling temporal information within a video clip of a few frames.
1 code implementation • 23 Jun 2022 • Jinghuan Shang, Srijan Das, Michael S. Ryoo
To this end, we propose a 3D Token Representation Layer (3DTRL) that estimates the 3D positional information of the visual tokens and leverages it for learning viewpoint-agnostic representations.
2 code implementations • 10 Jun 2022 • Xiang Li, Jinghuan Shang, Srijan Das, Michael S. Ryoo
We investigate whether self-supervised learning (SSL) can improve online reinforcement learning (RL) from pixels.
1 code implementation • 28 Mar 2022 • Saarthak Kapse, Srijan Das, Prateek Prasanna
To jointly leverage complementary information from multiple resolutions, we present a novel transformer-based Pyramidal Context-Detail Network (CD-Net).
no code implementations • 7 Dec 2021 • Srijan Das, Michael S. Ryoo
Learning self-supervised video representation predominantly focuses on discriminating instances generated from simple data augmentation schemes.
no code implementations • 7 Dec 2021 • Srijan Das, Michael S. Ryoo
To this end, we propose Cross-Modal Manifold Cutmix (CMMC) that inserts a video tesseract into another video tesseract in the feature space across two different modalities.
1 code implementation • CVPR 2022 • Rui Dai, Srijan Das, Kumara Kahatapitiya, Michael S. Ryoo, Francois Bremond
Action detection is an essential and challenging task, especially for densely labelled datasets of untrimmed videos.
Ranked #2 on Action Detection on TSU
no code implementations • 26 Oct 2021 • Rui Dai, Srijan Das, Francois Bremond
Action detection is an essential and challenging task, especially for densely labelled datasets of untrimmed videos.
Ranked #2 on Action Detection on Multi-THUMOS
no code implementations • 29 Sep 2021 • Srijan Das, Michael S. Ryoo
We find that our video mixing strategy, Vi-Mix, i.e., preliminary mixing of videos followed by CMMC across different modalities in a video, improves the quality of learned video representations.
no code implementations • 20 Aug 2021 • Snehashis Majhi, Srijan Das, Francois Bremond, Ratnakar Dash, Pankaj Kumar Sa
For a fully automated surveillance system capable of both detecting and classifying the anomalies that need immediate action, a joint anomaly detection and classification method is a pressing need.
no code implementations • ICCV 2021 • Rui Dai, Srijan Das, Francois Bremond
On the other hand, sequence-level distillation encourages the student to learn the temporal knowledge from the teacher, which consists of transferring the Global Contextual Relations and the Action Boundary Saliency.
1 code implementation • 17 May 2021 • Srijan Das, Rui Dai, Di Yang, Francois Bremond
But the cost of computing 3D poses from an RGB stream is high in the absence of appropriate sensors.
Ranked #10 on Action Recognition on NTU RGB+D 120 (using extra training data)
1 code implementation • 5 Jan 2021 • Rui Dai, Srijan Das, Luca Minciullo, Lorenzo Garattoni, Gianpiero Francesca, Francois Bremond
Previous action detection methods fail to select the key temporal information in long videos.
Ranked #1 on Action Detection on TSU
1 code implementation • 28 Oct 2020 • Rui Dai, Srijan Das, Saurav Sharma, Luca Minciullo, Lorenzo Garattoni, Francois Bremond, Gianpiero Francesca
Therefore, we propose a new baseline method for activity detection to tackle the novel challenges provided by our dataset.
1 code implementation • ECCV 2020 • Srijan Das, Saurav Sharma, Rui Dai, Francois Bremond, Monique Thonnat
The two key components of this VPN are a spatial embedding and an attention network.
Ranked #6 on Action Classification on Toyota Smarthome dataset (using extra training data)
no code implementations • ICCV 2019 • Srijan Das, Rui Dai, Michal Koperski, Luca Minciullo, Lorenzo Garattoni, Francois Bremond, Gianpiero Francesca
As recent activity recognition approaches fail to address the challenges posed by Toyota Smarthome, we present a novel activity recognition method with an attention mechanism.
Ranked #7 on Action Classification on Toyota Smarthome dataset (using extra training data)
no code implementations • 1 Feb 2018 • Srijan Das, Michal Koperski, Francois Bremond, Gianpiero Francesca
In this paper, we propose to improve the traditional use of RNNs by employing a many-to-many model for video classification.