no code implementations • MML (ACL) 2022 • SeongJun Jung, Woo Suk Choi, SeongHo Choi, Byoung-Tak Zhang
Recent GAN-based text-to-image generation models have advanced to the point that they can generate photo-realistic images that semantically match their descriptions.
Generative Adversarial Network Multi-lingual Text-to-Image Generation
no code implementations • 22 Mar 2024 • Seongjun Jeong, Gi-Cheon Kang, SeongHo Choi, Joochan Kim, Byoung-Tak Zhang
For the training and evaluation of CVLN agents, we re-arrange existing VLN datasets to propose two datasets: CVLN-I, focused on navigation via initial-instruction interpretation, and CVLN-D, aimed at navigation through dialogue with other agents.
1 code implementation • 5 Jun 2023 • Minjoon Jung, Youwon Jang, SeongHo Choi, Joochan Kim, Jin-Hwa Kim, Byoung-Tak Zhang
Video moment retrieval (VMR) identifies a specific moment in an untrimmed video for a given natural language query.
Ranked #6 on Moment Retrieval on Charades-STA
1 code implementation • 23 Oct 2022 • Minjoon Jung, SeongHo Choi, Joochan Kim, Jin-Hwa Kim, Byoung-Tak Zhang
Video corpus moment retrieval (VCMR) is the task of retrieving the most relevant video moment from a large video corpus using a natural language query.
Ranked #2 on Video Corpus Moment Retrieval on TVR
no code implementations • 8 Oct 2021 • Yu-Jung Heo, Minsu Lee, SeongHo Choi, Woo Suk Choi, Minjung Shin, Minjoon Jung, Jeh-Kwang Ryu, Byoung-Tak Zhang
In this paper, we propose the Video Turing Test to provide effective and practical assessments of video understanding intelligence as well as human-likeness evaluation of AI agents.
no code implementations • 11 Aug 2021 • Donggeon Lee, SeongHo Choi, Youwon Jang, Byoung-Tak Zhang
In this paper, we challenge the existing multiple-choice video question answering by changing it to open-ended video question answering.
no code implementations • 21 Jul 2021 • Minjung Shin, SeongHo Choi, Yu-Jung Heo, Minsu Lee, Byoung-Tak Zhang, Jeh-Kwang Ryu
We introduce CogME, a cognition-inspired, multi-dimensional evaluation metric designed for AI models focusing on story understanding.
no code implementations • 1 Apr 2019 • Yu-Jung Heo, Kyoung-Woon On, SeongHo Choi, Jaeseo Lim, Jinah Kim, Jeh-Kwang Ryu, Byung-Chull Bae, Byoung-Tak Zhang
Video understanding is emerging as a new paradigm for studying human-like AI.