no code implementations • 28 Mar 2024 • Binyuan Huang, Yuqing Wen, Yucheng Zhao, Yaosi Hu, Yingfei Liu, Fan Jia, Weixin Mao, Tiancai Wang, Chi Zhang, Chang Wen Chen, Zhenzhong Chen, Xiangyu Zhang
Autonomous driving progress relies on large-scale annotated datasets.
no code implementations • 17 Jan 2024 • Shuo Wang, Fan Jia, Yingfei Liu, Yucheng Zhao, Zehui Chen, Tiancai Wang, Chi Zhang, Xiangyu Zhang, Feng Zhao
This paper introduces the Stream Query Denoising (SQD) strategy as a novel approach for temporal modeling in high-definition map (HD-map) construction.
1 code implementation • 21 Dec 2023 • Haochen Wang, Junsong Fan, Yuxi Wang, Kaiyou Song, Tiancai Wang, Xiangyu Zhang, Zhaoxiang Zhang
To empower the model as a teacher, we propose Hard Patches Mining (HPM), predicting patch-wise losses and subsequently determining where to mask.
2 code implementations • 6 Dec 2023 • Hongyang Li, Yang Li, Huijie Wang, Jia Zeng, Huilin Xu, Pinlong Cai, Li Chen, Junchi Yan, Feng Xu, Lu Xiong, Jingdong Wang, Futang Zhu, Chunjing Xu, Tiancai Wang, Fei Xia, Beipeng Mu, Zhihui Peng, Dahua Lin, Yu Qiao
With the continuous maturation and application of autonomous driving technology, a systematic examination of open-source autonomous driving datasets becomes instrumental in fostering the robust evolution of the industry ecosystem.
no code implementations • 30 Nov 2023 • En Yu, Liang Zhao, Yana Wei, Jinrong Yang, Dongming Wu, Lingyu Kong, Haoran Wei, Tiancai Wang, Zheng Ge, Xiangyu Zhang, Wenbing Tao
Then, FIT requires MLLMs to first predict trajectories of related objects and then reason about potential future events based on them.
Ranked #66 on Visual Question Answering on MM-Vet
no code implementations • 29 Nov 2023 • Weixin Mao, Tiancai Wang, Diankun Zhang, Junjie Yan, Osamu Yoshie
Pillar-based methods mainly employ a randomly initialized 2D convolutional neural network (ConvNet) for feature extraction and fail to benefit from backbone scaling and pretraining in the image domain.
no code implementations • 28 Nov 2023 • Yuqing Wen, Yucheng Zhao, Yingfei Liu, Fan Jia, Yanhui Wang, Chong Luo, Chi Zhang, Tiancai Wang, Xiaoyan Sun, Xiangyu Zhang
This work notably propels the field of autonomous driving by effectively augmenting the training dataset used for advanced BEV perception techniques.
no code implementations • 22 Nov 2023 • Fan Jia, Weixin Mao, Yingfei Liu, Yucheng Zhao, Yuqing Wen, Chi Zhang, Xiangyu Zhang, Tiancai Wang
Based on the vision-action pairs, we construct a general world model based on MLLM and diffusion model for autonomous driving, termed ADriver-I.
no code implementations • 20 Nov 2023 • Shuailin Li, Yuang Zhang, Yucheng Zhao, Qiuyue Wang, Fan Jia, Yingfei Liu, Tiancai Wang
Despite the rapid development of video Large Language Models (LLMs), a comprehensive evaluation is still absent.
1 code implementation • 10 Oct 2023 • Dongming Wu, Jiahao Chang, Fan Jia, Yingfei Liu, Tiancai Wang, Jianbing Shen
Further, we propose TopoMLP, a simple yet high-performance pipeline for driving topology reasoning.
Ranked #3 on 3D Lane Detection on OpenLane-V2 val
2 code implementations • 8 Sep 2023 • Dongming Wu, Wencheng Han, Tiancai Wang, Yingfei Liu, Xiangyu Zhang, Jianbing Shen
A new trend in the computer vision community is to capture objects of interest following flexible human command represented by a natural language prompt.
1 code implementation • 18 Aug 2023 • Xiaohui Jiang, Shuailin Li, Yingfei Liu, Shihao Wang, Fan Jia, Tiancai Wang, Lijin Han, Xiangyu Zhang
Recently, 3D object detection from surround-view images has made notable advancements with its low deployment cost.
Ranked #1 on 3D Object Detection on nuScenes Camera Only
1 code implementation • ICCV 2023 • Dongming Wu, Tiancai Wang, Yuang Zhang, Xiangyu Zhang, Jianbing Shen
Referring video object segmentation (RVOS) aims at segmenting an object in a video following human instruction.
Referring Expression Segmentation • Referring Video Object Segmentation • +2
1 code implementation • 16 Jun 2023 • Dongming Wu, Fan Jia, Jiahao Chang, Zhuoling Li, Jianjian Sun, Chunrui Han, Shuailin Li, Yingfei Liu, Zheng Ge, Tiancai Wang
We present the 1st-place solution for OpenLane Topology in the Autonomous Driving Challenge.
no code implementations • 23 May 2023 • En Yu, Tiancai Wang, Zhuoling Li, Yuang Zhang, Xiangyu Zhang, Wenbing Tao
Although end-to-end multi-object trackers like MOTR enjoy the merits of simplicity, they suffer seriously from the conflict between detection and association, resulting in unsatisfactory convergence dynamics.
1 code implementation • ICCV 2023 • Shihao Wang, Yingfei Liu, Tiancai Wang, Ying Li, Xiangyu Zhang
On the standard nuScenes benchmark, it is the first online multi-view method that achieves comparable performance (67.6% NDS & 65.3% AMOTA) with lidar-based methods.
Ranked #1 on 3D Multi-Object Tracking on nuScenes Camera Only
1 code implementation • CVPR 2023 • Dongming Wu, Wencheng Han, Tiancai Wang, Xingping Dong, Xiangyu Zhang, Jianbing Shen
In this paper, we propose a new and general referring understanding task, termed referring multi-object tracking (RMOT).
2 code implementations • ICCV 2023 • Junjie Yan, Yingfei Liu, Jianjian Sun, Fan Jia, Shuailin Li, Tiancai Wang, Xiangyu Zhang
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection.
4 code implementations • CVPR 2023 • Yuang Zhang, Tiancai Wang, Xiangyu Zhang
In this paper, we propose MOTRv2, a simple yet effective pipeline to bootstrap end-to-end multi-object tracking with a pretrained object detector.
Ranked #2 on Multi-Object Tracking on DanceTrack (using extra training data)
Multi-Object Tracking • Multiple Object Tracking with Transformer • +2
no code implementations • 15 Nov 2022 • Jinrong Yang, Tiancai Wang, Zheng Ge, Weixin Mao, Xiaoping Li, Xiangyu Zhang
We propose a temporal 2D transformation to bridge the 3D predictions with temporal 2D labels.
3 code implementations • 27 Oct 2022 • Yuang Zhang, Tiancai Wang, Weiyao Lin, Xiangyu Zhang
We present our 1st place solution to the Group Dance Multiple People Tracking Challenge.
Multi-Object Tracking • Multiple Object Tracking with Transformer • +1
1 code implementation • ICCV 2023 • Yingfei Liu, Junjie Yan, Fan Jia, Shuailin Li, Aqi Gao, Tiancai Wang, Xiangyu Zhang, Jian Sun
More specifically, we extend the 3D position embedding (3D PE) in PETR for temporal modeling.
Ranked #2 on Bird's-Eye View Semantic Segmentation on nuScenes (IoU lane - 224x480 - 100x100 at 0.5 metric)
1 code implementation • CVPR 2022 • Zhiyuan Liang, Tiancai Wang, Xiangyu Zhang, Jian Sun, Jianbing Shen
The tree energy loss is effective and easy to incorporate into existing frameworks by combining it with a traditional segmentation loss.
1 code implementation • 10 Mar 2022 • Yingfei Liu, Tiancai Wang, Xiangyu Zhang, Jian Sun
Object queries can perceive the 3D position-aware features and perform end-to-end object detection.
1 code implementation • 9 Dec 2021 • Lufan Ma, Tiancai Wang, Bin Dong, Jiangpeng Yan, Xiu Li, Xiangyu Zhang
Our IFR enjoys several advantages: 1) it simulates an infinite-depth refinement network while requiring only the parameters of a single residual block; 2) it produces high-level equilibrium instance features with a global receptive field; 3) it serves as a plug-and-play general module that is easily extended to most object recognition frameworks.
1 code implementation • NeurIPS 2021 • Bin Dong, Fangao Zeng, Tiancai Wang, Xiangyu Zhang, Yichen Wei
Moreover, the joint learning of unified query representation can greatly improve the detection performance of DETR.
Ranked #4 on Object Detection on COCO minival (AP75 metric)
2 code implementations • 7 May 2021 • Fangao Zeng, Bin Dong, Yuang Zhang, Tiancai Wang, Xiangyu Zhang, Yichen Wei
Temporal modeling of objects is a key challenge in multiple object tracking (MOT).
Ranked #1 on Multi-Object Tracking on MOT17 (e2e-MOT metric)
Multi-Object Tracking • Multiple Object Tracking with Transformer • +1
no code implementations • 25 Dec 2020 • Tiancai Wang, Xiangyu Zhang, Jian Sun
In this paper, we present an implicit feature pyramid network (i-FPN) for object detection.
1 code implementation • 3 Dec 2020 • Tiancai Wang, Tong Yang, Jiale Cao, Xiangyu Zhang
Object detectors usually achieve promising results with the supervision of complete instance annotations.
1 code implementation • 13 Aug 2020 • Jialian Wu, Liangchen Song, Tiancai Wang, Qian Zhang, Junsong Yuan
In the classification tree, because there are significantly fewer parent class nodes, their logits are less noisy and can be used to suppress the wrong or noisy logits in the fine-grained class nodes.
Ranked #5 on Few-Shot Object Detection on LVIS v1.0 val
1 code implementation • CVPR 2020 • Tiancai Wang, Tong Yang, Martin Danelljan, Fahad Shahbaz Khan, Xiangyu Zhang, Jian Sun
Human-object interaction (HOI) detection strives to localize both the human and an object, as well as to identify the complex interactions between them.
no code implementations • ICCV 2019 • Tiancai Wang, Rao Muhammad Anwer, Muhammad Haris Khan, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao, Jorma Laaksonen
Our approach outperforms the state-of-the-art on all datasets.
1 code implementation • ICCV 2019 • Tiancai Wang, Rao Muhammad Anwer, Hisham Cholakkal, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao
We introduce a single-stage detection framework that combines the advantages of both fine-tuning pretrained models and training from scratch.