no code implementations • 9 Apr 2024 • Deshui Miao, Xin Li, Zhenyu He, Huchuan Lu, Ming-Hsuan Yang
In addition, we propose a spatial-temporal memory to assist feature association and temporal ID assignment and correlation.
1 code implementation • 28 Mar 2024 • Yuqing Huang, Xin Li, Zikun Zhou, YaoWei Wang, Zhenyu He, Ming-Hsuan Yang
Upon the PN tree memory, we develop corresponding walking rules for determining the state of the target and define a set of control flows to unite the tracker and the detector in different tracking scenarios.
no code implementations • 21 Feb 2024 • Kai Yang, Jan Ackermann, Zhenyu He, Guhao Feng, Bohang Zhang, Yunzhen Feng, Qiwei Ye, Di He, LiWei Wang
Our results show that while these models are expressive enough to solve general DP tasks, contrary to expectations, they require a model size that scales with the problem size.
no code implementations • 29 Jan 2024 • Zhenyu He, Guhao Feng, Shengjie Luo, Kai Yang, Di He, Jingjing Xu, Zhi Zhang, Hongxia Yang, LiWei Wang
In this work, we leverage the intrinsic segmentation of language sequences and design a new positional encoding method called Bilevel Positional Encoding (BiPE).
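As a rough illustration of what a "bilevel" position could look like, the sketch below assigns each token a pair of indices: the index of its segment and its offset inside that segment. The excerpt does not say how BiPE defines segments or combines the two levels, so the separator set, the function name `bilevel_positions`, and the pair representation are assumptions made for illustration only.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical separator set: assumes segments end at sentence punctuation.
DEFAULT_SEPARATORS: FrozenSet[str] = frozenset({".", "?", "!"})


def bilevel_positions(
    tokens: List[str],
    separators: FrozenSet[str] = DEFAULT_SEPARATORS,
) -> List[Tuple[int, int]]:
    """Return (segment_index, intra_segment_index) for every token."""
    positions: List[Tuple[int, int]] = []
    segment, offset = 0, 0
    for token in tokens:
        positions.append((segment, offset))
        offset += 1
        if token in separators:
            # A separator closes the current segment; the next token
            # starts a new segment at offset 0.
            segment += 1
            offset = 0
    return positions


tokens = ["The", "cat", "sat", ".", "It", "slept", "."]
positions = bilevel_positions(tokens)
print(positions)
```

In this toy scheme the intra-segment index stays bounded by the segment length no matter how long the sequence grows, while the segment index counts segments rather than tokens.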
1 code implementation • 14 Nov 2023 • Zhenyu He, Zexuan Zhong, Tianle Cai, Jason D. Lee, Di He
We introduce Retrieval-Based Speculative Decoding (REST), a novel algorithm designed to speed up language model generation.
1 code implementation • 27 Aug 2023 • Kaiyuan Gao, Sunan He, Zhenyu He, Jiacheng Lin, Qizhi Pei, Jie Shao, Wei Zhang
Generative pre-trained transformer (GPT) models have revolutionized the field of natural language processing (NLP) with remarkable performance in various tasks and also extend their power to multimodal domains.
no code implementations • 24 Aug 2023 • Zikun Zhou, Shukun Wu, Guoqing Zhu, Hongpeng Wang, Zhenyu He
In this paper, we propose a Channel and Spatial Relation-Propagation Network (CSRPNet) for RGB-T semantic segmentation, which propagates only modality-shared information across different modalities and alleviates the modality-specific information contamination issue.
Ranked #12 on Thermal Image Segmentation on PST900
no code implementations • 23 Aug 2023 • Chao Tian, Zikun Zhou, Yuqing Huang, Gaojun Li, Zhenyu He
RGB-Thermal (RGB-T) pedestrian detection aims to locate pedestrians in RGB-T image pairs, exploiting the complementarity between the two modalities to improve detection robustness in extreme conditions.
1 code implementation • ICCV 2023 • Xin Li, Yuqing Huang, Zhenyu He, YaoWei Wang, Huchuan Lu, Ming-Hsuan Yang
Existing visual tracking methods typically take an image patch as the reference of the target to perform tracking.
1 code implementation • ICCV 2023 • Junjie Fei, Teng Wang, Jinrui Zhang, Zhenyu He, Chengjie Wang, Feng Zheng
In this paper, we propose ViECap, a transferable decoding model that leverages entity-aware decoding to generate descriptions in both seen and unseen scenarios.
no code implementations • 29 May 2023 • Jianqiu Chen, Mingshan Sun, Tianpeng Bao, Rui Zhao, Liwei Wu, Zhenyu He
In this paper, we present a CAD model-based zero-shot pose estimation pipeline called ZeroPose.
1 code implementation • 25 Mar 2023 • Zikun Zhou, Kaige Mao, Wenjie Pei, Hongpeng Wang, YaoWei Wang, Zhenyu He
To be specific, RHMNet first uses only the memory in the high-reliability level to locate the high-reliability region belonging to the target, which is highly similar to the initial target scribble.
1 code implementation • CVPR 2023 • Li Zhou, Zikun Zhou, Kaige Mao, Zhenyu He
Such a separated framework overlooks the link between visual grounding and tracking, namely that the natural language descriptions provide global semantic cues for localizing the target in both steps.
Ranked #3 on Visual Tracking on TNL2K
no code implementations • 17 Jan 2023 • Jing Li, Di Kang, Wenjie Pei, Xuefei Zhe, Ying Zhang, Linchao Bao, Zhenyu He
Finally, we demonstrate that our method can be readily used to generate motion sequences with user-specified motion clips on the timeline.
no code implementations • 20 Oct 2022 • Jianqiu Chen, Mingshan Sun, Ye Zheng, Tianpeng Bao, Zhenyu He, Donghai Li, Guoqiang Jin, Rui Zhao, Liwei Wu, Xiaoke Jiang
Numerous 6D pose estimation methods have been proposed that employ end-to-end regression to directly estimate the target pose parameters.
no code implementations • 4 Oct 2022 • Honghu Pan, Yongyong Chen, Yunqi He, Xin Li, Zhenyu He
To this end, we propose Flow2Flow, a unified framework that could jointly achieve training sample expansion and cross-modality image generation for V2I person ReID.
no code implementations • 23 Sep 2022 • Honghu Pan, Yongyong Chen, Zhenyu He
To downsample the graph, we propose a multi-head full attention graph pooling (MHFAPool) layer, which integrates the advantages of existing node clustering and node selection pooling methods.
no code implementations • 23 Sep 2022 • Honghu Pan, Yongyong Chen, Tingyang Xu, Yunqi He, Zhenyu He
Extensive experiments on two large gait recognition datasets, i.e., CASIA-B and OUMVLP-Pose, demonstrate that our method outperforms the baseline model and existing pose-based methods by a large margin.
no code implementations • 23 Sep 2022 • Honghu Pan, Qiao Liu, Yongyong Chen, Yunqi He, Yuan Zheng, Feng Zheng, Zhenyu He
Finally, we propose a dual-attention method consisting of node-attention and time-attention to obtain the temporal graph representation from the node embeddings, where the self-attention mechanism is employed to learn the importance of each node and each frame.
no code implementations • 30 Aug 2022 • Yi Li, Wenjie Pei, Zhenyu He
In this paper, we attempt to build a deep learning model that mimics all four steps in the traditional homography estimation pipeline.
no code implementations • 26 Jun 2022 • Mengyan Zhang, Thanh Nguyen-Tang, Fangzhao Wu, Zhenyu He, Xing Xie, Cheng Soon Ong
We consider the problem of personalised news recommendation where each user consumes news in a sequential fashion.
1 code implementation • CVPR 2022 • Zikun Zhou, Jianqiu Chen, Wenjie Pei, Kaige Mao, Hongpeng Wang, Zhenyu He
While it can exploit temporal context such as the historical appearances and locations of the target, a potential limitation of such a strategy is that the local tracker tends to misidentify a nearby distractor as the target instead of activating the re-detector when the real target is out of view.
1 code implementation • 8 Mar 2022 • Jingfei Xia, Mingchen Zhuge, Tiantian Geng, Shun Fan, Yuantai Wei, Zhenyu He, Feng Zheng
Figure skating scoring is challenging because it requires judging the technical moves of the players as well as their coordination with the background music.
no code implementations • 28 Dec 2021 • Peng Tu, Yawen Huang, Feng Zheng, Zhenyu He, Liujun Cao, Ling Shao
In this paper, we propose a novel method for semi-supervised semantic segmentation named GuidedMix-Net, by leveraging labeled information to guide the learning of unlabeled instances.
no code implementations • 17 Oct 2021 • Di Yuan, Xiaojun Chang, Yi Yang, Qiao Liu, Dehua Wang, Zhenyu He
In this paper, we propose an active learning method for deep visual tracking, which selects and annotates the unlabeled samples to train the deep CNNs model.
no code implementations • ICCV 2021 • Jing Li, Di Kang, Wenjie Pei, Xuefei Zhe, Ying Zhang, Zhenyu He, Linchao Bao
In order to overcome this problem, we propose a novel conditional variational autoencoder (VAE) that explicitly models one-to-many audio-to-motion mapping by splitting the cross-modal latent code into shared code and motion-specific code.
Ranked #3 on Gesture Generation on BEAT
1 code implementation • ICCV 2021 • Zikun Zhou, Wenjie Pei, Xin Li, Hongpeng Wang, Feng Zheng, Zhenyu He
A potential limitation of such trackers is that not all patches are equally informative for tracking.
no code implementations • 21 Jun 2021 • Xin Li, Wenjie Pei, YaoWei Wang, Zhenyu He, Huchuan Lu, Ming-Hsuan Yang
While deep-learning based tracking methods have achieved substantial progress, they entail large-scale and high-quality annotated data for sufficient training.
1 code implementation • 15 Apr 2021 • Kai Yang, Zhenyu He, Wenjie Pei, Zikun Zhou, Xin Li, Di Yuan, Haijun Zhang
By tracking a target as a pair of corners, we avoid the need to design the anchor boxes.
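To make the corner-pair idea concrete, the sketch below turns a predicted top-left and bottom-right corner into a center-size bounding box without consulting any predefined anchor shapes. The function name `corners_to_box` and the `(cx, cy, w, h)` output convention are assumptions for illustration; the excerpt does not describe how the tracker predicts the corners.

```python
from typing import Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (cx, cy, w, h)


def corners_to_box(top_left: Point, bottom_right: Point) -> Box:
    """Convert a pair of corners into a center-size box."""
    x1, y1 = top_left
    x2, y2 = bottom_right
    if x2 <= x1 or y2 <= y1:
        # The corners must describe a box with positive width and height.
        raise ValueError("bottom_right must lie below and to the right of top_left")
    width, height = x2 - x1, y2 - y1
    return (x1 + width / 2, y1 + height / 2, width, height)


box = corners_to_box((10.0, 20.0), (50.0, 100.0))
print(box)
```

Because the box is fully determined by the two corner locations, its scale and aspect ratio are free to take any value rather than being chosen from a fixed anchor set.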
no code implementations • 13 Sep 2020 • Honghu Pan, Fanyang Meng, Nana Fan, Zhenyu He
Our method has the following two advantages: (1) We are the first to consider neighborhood information of descriptors, while former works mainly focus on neighborhood consistency of feature points; (2) Our method can be applied in any former work of learning descriptors by triplet loss.
1 code implementation • 3 Aug 2020 • Qiao Liu, Xin Li, Zhenyu He, Chenglong Li, Jun Li, Zikun Zhou, Di Yuan, Jing Li, Kai Yang, Nana Fan, Feng Zheng
We evaluate and analyze more than 30 trackers on LSOTB-TIR to provide a series of baselines, and the results show that deep trackers achieve promising performance.
Thermal Infrared Object Tracking
no code implementations • 3 Jul 2020 • Di Yuan, Xiu Shu, Nana Fan, Xiaojun Chang, Qiao Liu, Zhenyu He
Moreover, we introduce a classification part that is trained online and optimized with a Conjugate-Gradient-based strategy to guarantee real-time tracking speed.
no code implementations • 5 Jun 2020 • Honghu Pan, Fanyang Meng, Zhenyu He, Yongsheng Liang, Wei Liu
Then we define topology distance between descriptors as the difference of their topology vectors.
1 code implementation • 26 Nov 2019 • Qiao Liu, Xin Li, Zhenyu He, Nana Fan, Di Yuan, Wei Liu, Yongsheng Liang
These two feature models are learned using a multi-task matching framework and are jointly optimized on the TIR tracking task.
1 code implementation • 9 Jun 2019 • Qiao Liu, Xin Li, Zhenyu He, Nana Fan, Di Yuan, Hongpeng Wang
These two similarities complement each other and hence enhance the discriminative capacity of the network for handling distractors.
no code implementations • CVPR 2019 • Xin Li, Chao Ma, Baoyuan Wu, Zhenyu He, Ming-Hsuan Yang
Despite demonstrated successes in numerous vision tasks, the contributions of pre-trained deep features to visual tracking are not as significant as those to object recognition.
no code implementations • 23 Mar 2018 • Nana Fan, Zhenyu He
The IRs in training samples from cyclic shifts of the base training sample severely degrade the quality of a tracking model.
1 code implementation • 18 Jan 2018 • Qiao Liu, Zhenyu He, Xin Li, Yuan Zheng
The ability to evaluate the TIR pedestrian tracker fairly, on a benchmark dataset, is significant for the development of this field.
1 code implementation • 27 Nov 2017 • Xin Li, Qiao Liu, Nana Fan, Zhenyu He, Hongzhi Wang
In this paper, we cast the TIR tracking problem as a similarity verification task, which is coupled well to the objective of the tracking task.
1 code implementation • Knowledge-Based Systems 2017 • Qiao Liu, Xiaohuan Lu, Zhenyu He, Chunkai Zhang, WenSheng Chen
We observe that the features from the fully-connected layer are not suitable for thermal infrared tracking due to the lack of spatial information of the target, while the features from the convolution layers are.