no code implementations • 29 Feb 2024 • Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Ekaterina Deyneka, Hsiang-wei Chao, Byung Eun Jeon, Yuwei Fang, Hsin-Ying Lee, Jian Ren, Ming-Hsuan Yang, Sergey Tulyakov
Next, we fine-tune a retrieval model on a small subset in which the best caption for each video is manually selected, and then apply the model to the whole dataset to select the best caption as the annotation.
no code implementations • 22 Feb 2024 • Willi Menapace, Aliaksandr Siarohin, Ivan Skorokhodov, Ekaterina Deyneka, Tsai-Shien Chen, Anil Kag, Yuwei Fang, Aleksei Stoliar, Elisa Ricci, Jian Ren, Sergey Tulyakov
Since video content is highly redundant, we argue that naively bringing advances of image models to the video generation domain reduces motion fidelity and visual quality, and impairs scalability.
Ranked #1 on Text-to-Video Generation on MSR-VTT
no code implementations • 27 Apr 2023 • Tsai-Shien Chen, Chieh Hubert Lin, Hung-Yu Tseng, Tsung-Yi Lin, Ming-Hsuan Yang
In response to this gap, we introduce MCDiff, a conditional diffusion model that generates a video from a starting image frame and a set of strokes, which allow users to specify the intended content and dynamics for synthesis.
no code implementations • 29 Sep 2021 • Tsai-Shien Chen, Chih-Ting Liu, Shao-Yi Chien
Fine-grained recognition aims to discriminate among the sub-categories of images within one general category.
no code implementations • 14 Jun 2021 • Chih-Ting Liu, Man-Yu Lee, Tsai-Shien Chen, Shao-Yi Chien
Person re-identification (re-ID) has achieved great success with supervised learning methods.
no code implementations • ICLR 2022 • Tsai-Shien Chen, Wei-Chih Hung, Hung-Yu Tseng, Shao-Yi Chien, Ming-Hsuan Yang
Self-supervised learning has recently shown great potential in vision tasks through contrastive learning, which aims to discriminate each image, or instance, in the dataset.
no code implementations • 12 Oct 2020 • Tsai-Shien Chen, Man-Yu Lee, Chih-Ting Liu, Shao-Yi Chien
Our VCAM enables the feature learning framework to reweigh the importance of each feature map channel-wise according to the "viewpoint" of the input vehicle.
1 code implementation • ECCV 2020 • Tsai-Shien Chen, Chih-Ting Liu, Chih-Wei Wu, Shao-Yi Chien
Vehicle re-identification (re-ID) focuses on matching images of the same vehicle across different cameras.