1 code implementation • 28 Apr 2024 • Mingzhen Huang, Shan Jia, Zhou Zhou, Yan Ju, Jialing Cai, Siwei Lyu
In the battle against widespread online misinformation, a growing problem is text-image inconsistency, where images are misleadingly paired with text that differs in intent or meaning.
1 code implementation • 18 Aug 2023 • Yuanhao Zhai, Mingzhen Huang, Tianyu Luan, Lu Dong, Ifeoma Nwogu, Siwei Lyu, David Doermann, Junsong Yuan
In this paper, we propose ATOM (ATomic mOtion Modeling) to mitigate this problem by decomposing actions into atomic actions and employing a curriculum learning strategy to learn atomic action composition.
1 code implementation • 14 Apr 2023 • Shan Jia, Mingzhen Huang, Zhou Zhou, Yan Ju, Jialing Cai, Siwei Lyu
To achieve this, we propose a new approach that leverages the DALL-E2 language-image model to automatically generate and splice masked regions guided by a text prompt.
no code implementations • CVPR 2023 • Mingzhen Huang, Xiaoxing Li, Jun Hu, Honghong Peng, Siwei Lyu
DETracker outperforms existing state-of-the-art methods on the DogThruGlasses and YouTube-Hand datasets.
1 code implementation • CVPR 2022 • Supreeth Narasimhaswamy, Thanh Nguyen, Mingzhen Huang, Minh Hoai
We also introduce a new, challenging dataset called BodyHands, containing unconstrained images annotated with hand locations and their corresponding body locations.
1 code implementation • CVPR 2022 • Mingzhen Huang, Supreeth Narasimhaswamy, Saif Vazir, Haibin Ling, Minh Hoai
The first stage is Forward Propagation, where the features from frame t-1 are propagated to frame t based on previously detected hands and their estimated motion.
Ranked #1 on Multiple Object Tracking on YouTube-Hands (using extra training data)
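As a rough illustration of the Forward Propagation idea described above, the sketch below carries each hand detected in frame t-1 into frame t by applying its estimated motion. It is a simplification under stated assumptions: it shifts bounding boxes rather than feature maps, and the names `Box`, `propagate_box`, and `propagate_all` are hypothetical and do not come from the paper or its code release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box for one detected hand (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float


def propagate_box(box, motion):
    """Shift a box from frame t-1 by its estimated (dx, dy) motion."""
    dx, dy = motion
    return Box(box.x1 + dx, box.y1 + dy, box.x2 + dx, box.y2 + dy)


def propagate_all(prev_boxes, motions):
    """Predict frame-t boxes for every track id seen in frame t-1.

    Tracks without a motion estimate are assumed to stay in place.
    """
    return {
        track_id: propagate_box(box, motions.get(track_id, (0.0, 0.0)))
        for track_id, box in prev_boxes.items()
    }


prev_boxes = {1: Box(10.0, 20.0, 30.0, 40.0), 2: Box(50.0, 50.0, 70.0, 80.0)}
motions = {1: (5.0, -2.0)}  # track 2 has no motion estimate
predicted = propagate_all(prev_boxes, motions)
```

Here `predicted[1]` is the first box shifted by (5, -2), while `predicted[2]` is unchanged because no motion was supplied for that track.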
no code implementations • ICCV 2021 • Jingyi Xu, Hieu Le, Mingzhen Huang, ShahRukh Athar, Dimitris Samaras
We assume that the distribution of intra-class variance generalizes from the base classes to the novel classes.
Ranked #15 on Few-Shot Image Classification on CUB 200 5-way 5-shot
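The intra-class variance assumption above can be illustrated with a toy example: estimate the per-dimension spread of a base class, then reuse that spread to generate extra samples around a single novel-class example. This is only a sketch of the assumption, not the authors' method; the Gaussian noise model and the names `per_dim_std` and `augment_novel` are hypothetical choices made for illustration.

```python
import random
import statistics


def per_dim_std(samples):
    """Population standard deviation of each feature dimension."""
    return [statistics.pstdev(dim) for dim in zip(*samples)]


def augment_novel(novel_example, base_samples, n, seed=0):
    """Draw n synthetic features around one novel-class example.

    Assumes the spread observed in the base class also applies
    to the novel class (Gaussian noise, one std per dimension).
    """
    rng = random.Random(seed)
    stds = per_dim_std(base_samples)
    return [
        [rng.gauss(x, s) for x, s in zip(novel_example, stds)]
        for _ in range(n)
    ]


# Base class varies along the first dimension only.
base_samples = [[0.0, 0.0], [2.0, 0.0]]
novel_example = [10.0, 3.0]
synthetic = augment_novel(novel_example, base_samples, n=5)
```

Because the base class has no spread in its second dimension, every synthetic sample keeps the novel example's second coordinate and varies only in the first.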
1 code implementation • 8 Sep 2020 • Heng Fan, Hexin Bai, Liting Lin, Fan Yang, Peng Chu, Ge Deng, Sijia Yu, Harshit, Mingzhen Huang, Juehuan Liu, Yong Xu, Chunyuan Liao, Lin Yuan, Haibin Ling
The average video length of LaSOT is around 2,500 frames, and each video contains various challenge factors found in real-world footage, such as targets disappearing and re-appearing.