no code implementations • 12 Apr 2024 • Hyesong Choi, Hyejin Park, Kwang Moo Yi, Sungmin Cha, Dongbo Min
In this paper, we introduce Saliency-Based Adaptive Masking (SBAM), a novel and cost-effective approach that significantly enhances the pre-training performance of Masked Image Modeling (MIM) approaches by prioritizing token salience.
no code implementations • 12 Apr 2024 • Hyesong Choi, Hunsang Lee, Seyoung Joung, Hyejin Park, Jiyeong Kim, Dongbo Min
Initially, we delve into an exploration of the inherent properties that a masked token ought to possess.
1 code implementation • ICCV 2023 • Hyesong Choi, Hunsang Lee, Seongwon Jeong, Dongbo Min
Generalization capability of vision-based deep reinforcement learning (RL) is indispensable to deal with dynamic environment changes that exist in visual observations.
no code implementations • CVPR 2023 • Hyesong Choi, Hunsang Lee, Wonil Song, Sangryul Jeon, Kwanghoon Sohn, Dongbo Min
Recent vision-based reinforcement learning (RL) methods have found extracting high-level features from raw pixels with self-supervised learning to be effective in learning policies.
1 code implementation • 6 Sep 2022 • Sunkyung Kim, Hyesong Choi, Dongbo Min
The cross-task attention module (CTAM) is first applied to facilitate the exchange of relevant information between the multiple task features of the same scale.
no code implementations • CVPR 2022 • Hunsang Lee, Hyesong Choi, Kwanghoon Sohn, Dongbo Min
In this way, the pair-wise operation establishes non-local connectivity while maintaining the desired properties of the local attention, i.e., inductive bias of locality and linear complexity to input resolution.
no code implementations • 29 Sep 2021 • Wonil Song, Sangryul Jeon, Hyesong Choi, Kwanghoon Sohn, Dongbo Min
Given the latent representations as skills, a skill-based policy network is trained to generate similar trajectories to the learned decoder of the trajectory VAE.
no code implementations • 29 Sep 2021 • Hyesong Choi, Hunsang Lee, Wonil Song, Sangryul Jeon, Kwanghoon Sohn, Dongbo Min
The proposed method imposes similarity constraints on three latent volumes: query representations warped by the estimated flows, target representations predicted by the transition model, and target representations of the future state.
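As a rough illustration of what a similarity constraint over three latent representations could look like, the sketch below averages a cosine-distance term over the three pairs. The excerpt does not state the form of the constraint, so the cosine distance, the pairwise averaging, the flattening of each volume into a plain list, and the names `cosine_similarity` and `three_way_similarity_loss` are all assumptions made for illustration, not the authors' formulation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Small epsilon avoids division by zero for all-zero vectors.
    return dot / (norm_a * norm_b + 1e-8)


def three_way_similarity_loss(warped_query, predicted_target, target):
    """Mean of (1 - cosine similarity) over the three pairs (assumed form)."""
    pairs = [
        (warped_query, predicted_target),
        (warped_query, target),
        (predicted_target, target),
    ]
    return sum(1.0 - cosine_similarity(a, b) for a, b in pairs) / len(pairs)
```

Under these assumptions the loss approaches zero when all three representations point in the same direction and grows as any pair diverges.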
no code implementations • 1 Jan 2021 • Sunkyung Kim, Hyesong Choi, Dongbo Min
More importantly, the pseudo depth labels serve to impose a cross-view consistency on the estimated monocular depth and segmentation maps of two views.
1 code implementation • ICCV 2021 • Hyesong Choi, Hunsang Lee, Sunkyung Kim, Sunok Kim, Seungryong Kim, Kwanghoon Sohn, Dongbo Min
To cope with the prediction error of the confidence map itself, we also leverage the threshold network that learns the threshold dynamically conditioned on the pseudo depth maps.
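A minimal sketch of how a threshold can gate pseudo depth labels by confidence, assuming a simple masked L1 loss. In the excerpt the threshold is learned by a network conditioned on the pseudo depth maps; here it is passed in as a plain float standing in for that network's output. The function name `masked_l1_loss`, the flat-list inputs, and the choice of L1 are hypothetical and not taken from the paper.

```python
def masked_l1_loss(pred_depth, pseudo_depth, confidence, threshold):
    """Mean absolute error over pixels whose confidence exceeds the threshold.

    `threshold` stands in for the output of a learned threshold network
    (an assumption of this sketch); all other inputs are flat lists of floats.
    """
    kept = [
        abs(p - d)
        for p, d, c in zip(pred_depth, pseudo_depth, confidence)
        if c > threshold
    ]
    # With no confident pixels there is nothing to supervise.
    return sum(kept) / len(kept) if kept else 0.0
```

For example, with predictions `[1.0, 2.0, 3.0]`, pseudo labels `[1.5, 2.0, 10.0]`, confidences `[0.9, 0.8, 0.1]`, and a threshold of `0.5`, the unreliable third pixel is ignored and only the first two contribute to the loss.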