no code implementations • 8 May 2024 • Qing Yu, Mikihiro Tanaka, Kent Fujiwara
To build a cross-modal latent space between 3D human motion and language, acquiring large-scale and high-quality human motion data is crucial.
1 code implementation • 23 Oct 2023 • Shuhei Yokoo, Peifei Zhu, Yuchi Ishikawa, Mikihiro Tanaka, Masayoshi Kondo, Hirokatsu Kataoka
Our solution adopts the large multimodal models CLIP and BLIP-2 to filter and modify web crawl data, and utilizes external datasets along with a bag of tricks to improve the data quality.
no code implementations • ICCV 2023 • Mikihiro Tanaka, Kent Fujiwara
We claim that certain interactions, which we call asymmetric interactions, involve a relationship between an actor and a receiver, whose motions significantly differ depending on the assigned role.
no code implementations • 6 Mar 2020 • Mikihiro Tanaka, Tatsuya Harada
In this study, we introduce a low-cost method for generating descriptions from images containing novel objects.
2 code implementations • ICCV 2019 • Mikihiro Tanaka, Takayuki Itamochi, Kenichi Narioka, Ikuro Sato, Yoshitaka Ushiku, Tatsuya Harada
Moreover, we regard easily understood sentences as those that humans comprehend correctly and quickly.