Search Results for author: Xuefei Cao

Found 5 papers, 2 papers with code

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

1 code implementation • 8 Apr 2024 • Bo He, Hengduo Li, Young Kyun Jang, Menglin Jia, Xuefei Cao, Ashish Shah, Abhinav Shrivastava, Ser-Nam Lim

However, existing LLM-based large multimodal models (e.g., Video-LLaMA, VideoChat) can only take in a limited number of frames for short video understanding.

Ranked #1 on Video Classification on COIN

Question Answering, Video Captioning +4

Unifying Tracking and Image-Video Object Detection

no code implementations • 20 Nov 2022 • Peirong Liu, Rui Wang, Pengchuan Zhang, Omid Poursaeed, Yipin Zhou, Xuefei Cao, Sreya Dutta Roy, Ashish Shah, Ser-Nam Lim

We propose TrIVD (Tracking and Image-Video Detection), the first framework that unifies image OD, video OD, and MOT within one end-to-end model.

Multi-Object Tracking, Object +2

Object-Centric Unsupervised Image Captioning

1 code implementation • 2 Dec 2021 • Zihang Meng, David Yang, Xuefei Cao, Ashish Shah, Ser-Nam Lim

Our work in this paper overcomes this by harvesting objects corresponding to a given sentence from the training set, even if they don't belong to the same image.

Image Captioning, Object +1

Differential Motion Evolution for Fine-Grained Motion Deformation in Unsupervised Image Animation

no code implementations • 9 Oct 2021 • Peirong Liu, Rui Wang, Xuefei Cao, Yipin Zhou, Ashish Shah, Ser-Nam Lim

Key findings are twofold: (1) by capturing the motion transfer with an ordinary differential equation (ODE), it helps to regularize the motion field, and (2) by utilizing the source image itself, we are able to inpaint occluded/missing regions arising from large motion changes.

Image Animation, Motion Estimation

Unsupervised Deep Metric Learning via Auxiliary Rotation Loss

no code implementations • 16 Nov 2019 • Xuefei Cao, Bor-Chun Chen, Ser-Nam Lim

In this work, we propose to generate pseudo-labels for deep metric learning directly from clustering assignment and we introduce unsupervised deep metric learning (UDML) regularized by a self-supervision (SS) task.

Clustering, Image Retrieval +3
