no code implementations • 30 May 2024 • Zian Su, Xiangzhe Xu, Ziyang Huang, Kaiyuan Zhang, Xiangyu Zhang
Recent advancements in uni-modal code model pre-training, particularly in generative Source Code Foundation Models (SCFMs) and binary understanding models, have laid the groundwork for transfer learning applicable to HOBRE.
no code implementations • 28 May 2024 • Yifan Bai, Dongming Wu, Yingfei Liu, Fan Jia, Weixin Mao, Ziheng Zhang, Yucheng Zhao, Jianbing Shen, Xing Wei, Tiancai Wang, Xiangyu Zhang
Despite its simplicity, Atlas demonstrates superior performance in both 3D detection and ego planning tasks on the nuScenes dataset, proving that a 3D-tokenized LLM is the key to reliable autonomous driving.
1 code implementation • 26 May 2024 • Tianyu Xie, Yu Zhu, Longlin Yu, Tong Yang, Ziheng Cheng, Shiyue Zhang, Xiangyu Zhang, Cheng Zhang
We propose reflected flow matching (RFM) to train the velocity model in reflected CNFs by matching the conditional velocity fields in a simulation-free manner, similar to the vanilla FM.
1 code implementation • 23 May 2024 • Chenglong Liu, Haoran Wei, Jinyue Chen, Lingyu Kong, Zheng Ge, Zining Zhu, Liang Zhao, Jianjian Sun, Chunrui Han, Xiangyu Zhang
Modern LVLMs still struggle to achieve fine-grained document understanding, such as OCR/translation/caption for regions of interest to the user, tasks that require the context of the entire page, or even multiple pages.
no code implementations • 21 May 2024 • Xiangyu Zhang, Qiquan Zhang, Hexin Liu, Tianyi Xiao, Xinyuan Qian, Beena Ahmed, Eliathamby Ambikairajah, Haizhou Li, Julien Epps
Moreover, experiments demonstrate the effectiveness of BiMamba as an alternative to the self-attention module in Transformer and its derivatives, particularly for the semantic-aware task.
no code implementations • 16 May 2024 • Aditya Joshi, Jake Renzella, Pushpak Bhattacharyya, Saurav Jha, Xiangyu Zhang
While neural approaches using deep learning are the state-of-the-art for natural language processing (NLP) today, pre-neural algorithms and approaches still find a place in NLP textbooks and courses of recent years.
1 code implementation • 10 May 2024 • Xinyu Chang, Xiangyu Zhang, Haoruo Zhang, Yulu Ran
This study explores the application of recurrent neural networks to recognize emotions conveyed in music, aiming to enhance music recommendation systems and support therapeutic interventions by tailoring music to fit listeners' emotional states.
1 code implementation • 16 Apr 2024 • Chanwoo Bae, Guanhong Tao, Zhuo Zhang, Xiangyu Zhang
As such, analysts often resort to text search techniques to identify existing malware reports based on the symptoms they observe, exploiting the fact that malware samples share a lot of similarity, especially those from the same origin.
1 code implementation • 16 Apr 2024 • Ke Zhu, Liang Zhao, Zheng Ge, Xiangyu Zhang
We generate chosen and rejected responses with regard to the original and augmented image pairs, and conduct preference alignment with direct preference optimization.
Ranked #34 on Visual Question Answering on MM-Vet
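As a rough illustration of the preference-alignment step mentioned in the entry above, the following is a minimal sketch of the standard direct preference optimization (DPO) loss for a single chosen/rejected pair. It is not taken from the paper's code: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for this example, and the inputs are assumed to be summed log-probabilities of each response under the trained policy and a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (hypothetical helper, illustration only)."""
    # Implicit reward margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) computed in a numerically stable way.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

When the policy and the reference agree, the margin is zero and the loss equals log 2; the loss falls as the policy shifts probability toward the chosen response.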
1 code implementation • 15 Apr 2024 • Jinyue Chen, Lingyu Kong, Haoran Wei, Chenglong Liu, Zheng Ge, Liang Zhao, Jianjian Sun, Chunrui Han, Xiangyu Zhang
To address this, we propose OneChart: a reliable agent specifically devised for the structural extraction of chart information.
1 code implementation • 1 Apr 2024 • Zhiyuan Cheng, Zhaoyi Liu, Tengda Guo, Shiwei Feng, Dongfang Liu, Mingjie Tang, Xiangyu Zhang
Our attack prototype, named BadPart, is evaluated on both MDE and OFE tasks, utilizing a total of 7 models.
no code implementations • 28 Mar 2024 • Binyuan Huang, Yuqing Wen, Yucheng Zhao, Yaosi Hu, Yingfei Liu, Fan Jia, Weixin Mao, Tiancai Wang, Chi Zhang, Chang Wen Chen, Zhenzhong Chen, Xiangyu Zhang
Autonomous driving progress relies on large-scale annotated datasets.
1 code implementation • 25 Mar 2024 • Siyuan Cheng, Guanhong Tao, Yingqi Liu, Guangyu Shen, Shengwei An, Shiwei Feng, Xiangzhe Xu, Kaiyuan Zhang, Shiqing Ma, Xiangyu Zhang
Backdoor attacks pose a significant security threat to Deep Learning applications.
no code implementations • 9 Mar 2024 • Hexin Liu, Xiangyu Zhang, Leibny Paola Garcia, Andy W. H. Khong, Eng Siong Chng, Shinji Watanabe
Performance evaluation using large language models reveals the advantage of the linguistic hint by achieving 14.1% and 5.5% relative improvement on test sets of the ASRU and SEAME datasets, respectively.
1 code implementation • 19 Feb 2024 • Zian Su, Xiangzhe Xu, Ziyang Huang, Zhuo Zhang, Yapeng Ye, Jianjun Huang, Xiangyu Zhang
Our pre-trained model can improve the SOTAs in these tasks from 53% to 64%, 49% to 60%, and 74% to 94%, respectively.
no code implementations • 17 Feb 2024 • Xiangyu Zhang, Hexin Liu, Kaishuai Xu, Qiquan Zhang, Daijiao Liu, Beena Ahmed, Julien Epps
In addition, this approach is not only valuable for the detection of depression but also represents a new perspective in enhancing the ability of LLMs to comprehend and process speech signals.
no code implementations • 16 Feb 2024 • Xiangyu Zhang, Daijiao Liu, Hexin Liu, Qiquan Zhang, Hanyu Meng, Leibny Paola Garcia, Eng Siong Chng, Lina Yao
Recently, Denoising Diffusion Probabilistic Models (DDPMs) have attained leading performances across a diverse range of generative tasks.
no code implementations • 16 Feb 2024 • Chengpeng Wang, Wuqi Zhang, Zian Su, Xiangzhe Xu, Xiaoheng Xie, Xiangyu Zhang
Dataflow analysis is a powerful code analysis technique that reasons about dependencies between program values, offering support for code optimization, program comprehension, and bug detection.
1 code implementation • 8 Feb 2024 • Guangyu Shen, Siyuan Cheng, Kaiyuan Zhang, Guanhong Tao, Shengwei An, Lu Yan, Zhuo Zhang, Shiqing Ma, Xiangyu Zhang
Large Language Models (LLMs) have become prevalent across diverse sectors, transforming human life with their extraordinary reasoning and comprehension abilities.
no code implementations • 25 Jan 2024 • Xiaolong Jin, Zhuo Zhang, Xiangyu Zhang
Given the low cost of our method, we are able to conduct a large-scale study regarding LLM alignment issues in different worlds.
no code implementations • 23 Jan 2024 • Haoran Wei, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, En Yu, Jianjian Sun, Chunrui Han, Xiangyu Zhang
In Vary-toy, we introduce an improved vision vocabulary, allowing the model to not only possess all features of Vary but also gather more generality.
Ranked #81 on Visual Question Answering on MM-Vet
no code implementations • 17 Jan 2024 • Shuo Wang, Fan Jia, Yingfei Liu, Yucheng Zhao, Zehui Chen, Tiancai Wang, Chi Zhang, Xiangyu Zhang, Feng Zhao
This paper introduces the Stream Query Denoising (SQD) strategy as a novel approach for temporal modeling in high-definition map (HD-map) construction.
no code implementations • NeurIPS 2023 • Di Qi, Tong Yang, Xiangyu Zhang
We hope our approach can provide preliminary understanding of the physical world and help ease future research in 3D object-centric representation learning.
1 code implementation • 21 Dec 2023 • Haochen Wang, Junsong Fan, Yuxi Wang, Kaiyou Song, Tiancai Wang, Xiangyu Zhang, Zhaoxiang Zhang
To empower the model as a teacher, we propose Hard Patches Mining (HPM), predicting patch-wise losses and subsequently determining where to mask.
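One simple reading of "predicting patch-wise losses and subsequently determining where to mask" is to mask the patches with the highest predicted loss. The sketch below shows only that selection step under this assumption; the function `hardest_patch_mask`, its parameters, and the plain top-k rule are hypothetical and are not claimed to match the masking schedule used in the paper.

```python
def hardest_patch_mask(predicted_losses, mask_ratio):
    """Return a boolean mask that is True for the patches to hide.

    predicted_losses: one predicted reconstruction loss per patch.
    mask_ratio: fraction of patches to mask (assumed to lie in [0, 1]).
    """
    n = len(predicted_losses)
    k = int(n * mask_ratio)
    # Rank patch indices from hardest (largest predicted loss) to easiest.
    order = sorted(range(n), key=lambda i: predicted_losses[i], reverse=True)
    hard = set(order[:k])
    return [i in hard for i in range(n)]
```

For example, `hardest_patch_mask([0.1, 0.9, 0.4, 0.7], 0.5)` masks the second and fourth patches.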
1 code implementation • 11 Dec 2023 • Haoran Wei, Lingyu Kong, Jinyue Chen, Liang Zhao, Zheng Ge, Jinrong Yang, Jianjian Sun, Chunrui Han, Xiangyu Zhang
Accordingly, we propose Vary, an efficient and effective method to scale up the vision vocabulary of LVLMs.
Ranked #56 on Visual Question Answering on MM-Vet
1 code implementation • 11 Dec 2023 • Hao Tan, Jun Li, Yizhuang Zhou, Jun Wan, Zhen Lei, Xiangyu Zhang
We introduce text supervision to the optimization of prompts, which enables two benefits: 1) releasing the model's reliance on the pre-defined category names during inference, thereby enabling more flexible prompt generation; 2) reducing the number of inputs to the text encoder, which decreases GPU memory consumption significantly.
no code implementations • 8 Dec 2023 • Zhuo Zhang, Guangyu Shen, Guanhong Tao, Siyuan Cheng, Xiangyu Zhang
Instead, it exploits the fact that even when an LLM rejects a toxic request, a harmful response often hides deep in the output logits.
no code implementations • 30 Nov 2023 • En Yu, Liang Zhao, Yana Wei, Jinrong Yang, Dongming Wu, Lingyu Kong, Haoran Wei, Tiancai Wang, Zheng Ge, Xiangyu Zhang, Wenbing Tao
Then, FIT requires MLLMs to first predict trajectories of related objects and then reason about potential future events based on them.
Ranked #66 on Visual Question Answering on MM-Vet
no code implementations • 28 Nov 2023 • Yuqing Wen, Yucheng Zhao, Yingfei Liu, Fan Jia, Yanhui Wang, Chong Luo, Chi Zhang, Tiancai Wang, Xiaoyan Sun, Xiangyu Zhang
This work notably propels the field of autonomous driving by effectively augmenting the training dataset used for advanced BEV perception techniques.
1 code implementation • 27 Nov 2023 • Shengwei An, Sheng-Yen Chou, Kaiyuan Zhang, QiuLing Xu, Guanhong Tao, Guangyu Shen, Siyuan Cheng, Shiqing Ma, Pin-Yu Chen, Tsung-Yi Ho, Xiangyu Zhang
Diffusion models (DMs) have become state-of-the-art generative models because of their capability to generate high-quality images from noise without adversarial training.
1 code implementation • 27 Nov 2023 • Shuyue Stella Li, Beining Xu, Xiangyu Zhang, Hexin Liu, WenHan Chao, Leibny Paola Garcia
There is a positive correlation between PSR scores and ASR performance, suggesting that phonetic information extracted by monolingual SSL models can be used for downstream tasks in cross-lingual settings.
no code implementations • 22 Nov 2023 • Nan Jiang, Chengxiao Wang, Kevin Liu, Xiangzhe Xu, Lin Tan, Xiangyu Zhang
We build Nova$^+$ to further boost Nova using two new pre-training tasks, i.e., optimization generation and optimization level prediction, which are designed to learn binary optimization and align equivalent binaries.
no code implementations • 22 Nov 2023 • Fan Jia, Weixin Mao, Yingfei Liu, Yucheng Zhao, Yuqing Wen, Chi Zhang, Xiangyu Zhang, Tiancai Wang
Based on the vision-action pairs, we construct a general world model based on MLLM and diffusion model for autonomous driving, termed ADriver-I.
1 code implementation • NeurIPS 2023 • Longlin Yu, Tianyu Xie, Yu Zhu, Tong Yang, Xiangyu Zhang, Cheng Zhang
Semi-implicit variational inference (SIVI) has been introduced to expand the analytical variational families by defining expressive semi-implicit distributions in a hierarchical manner.
1 code implementation • 16 Oct 2023 • Ruiqi Wu, Liangyu Chen, Tong Yang, Chunle Guo, Chongyi Li, Xiangyu Zhang
Specifically, we design a first-frame-conditioned pipeline that uses an off-the-shelf text-to-image model for content generation so that our tuned video diffusion model mainly focuses on motion learning.
no code implementations • 8 Oct 2023 • Cheng Zhong, Zhifu Jiang, Xiangyu Zhang, Jikai Chen, Yang Li
Finally, a microgrid simulation model including multiple PV and wind DGs is built and run in various scenarios for comparison with the traditional secondary frequency control method.
no code implementations • 29 Sep 2023 • Hexin Liu, Leibny Paola Garcia, Xiangyu Zhang, Andy W. H. Khong, Sanjeev Khudanpur
Languages usually switch within a multilingual speech signal, especially in a bilingual society.
no code implementations • 27 Sep 2023 • Xiangyu Zhang, Zongqiang Kuang, Zehao Zhang, Fan Huang, Xianfeng Tan
Finally, we evaluate our Cold & Warm Net on public datasets in comparison to models commonly applied in the matching stage and it outperforms other models on all user types.
1 code implementation • 26 Sep 2023 • Ruixing Liang, Xiangyu Zhang, Qiong Li, Lai Wei, Hexin Liu, Avisha Kumar, Kelley M. Kempski Leadingham, Joshua Punnoose, Leibny Paola Garcia, Amir Manbachi
While significant advancements in artificial intelligence (AI) have catalyzed progress across various domains, its full potential in understanding visual perception remains underexplored.
1 code implementation • 20 Sep 2023 • Runpei Dong, Chunrui Han, Yuang Peng, Zekun Qi, Zheng Ge, Jinrong Yang, Liang Zhao, Jianjian Sun, HongYu Zhou, Haoran Wei, Xiangwen Kong, Xiangyu Zhang, Kaisheng Ma, Li Yi
This paper presents DreamLLM, a learning framework that first achieves versatile Multimodal Large Language Models (MLLMs) empowered with frequently overlooked synergy between multimodal comprehension and creation.
Ranked #2 on Visual Question Answering on MMBench (GPT-3.5 score metric)
2 code implementations • 8 Sep 2023 • Dongming Wu, Wencheng Han, Tiancai Wang, Yingfei Liu, Xiangyu Zhang, Jianbing Shen
A new trend in the computer vision community is to capture objects of interest following flexible human command represented by a natural language prompt.
1 code implementation • NeurIPS 2023 • Qi Han, Yuxuan Cai, Xiangyu Zhang
This design gives our architecture a nice property: maintaining disentangled low-level and semantic information at the end of the network in MIM pre-training.
1 code implementation • 18 Aug 2023 • Xiaohui Jiang, Shuailin Li, Yingfei Liu, Shihao Wang, Fan Jia, Tiancai Wang, Lijin Han, Xiangyu Zhang
Recently, 3D object detection from surround-view images has made notable advancements with its low deployment cost.
Ranked #1 on 3D Object Detection on nuScenes Camera Only
no code implementations • 14 Aug 2023 • Xijun Wang, Xiaojie Chu, Chunrui Han, Xiangyu Zhang
This paper presents a module, Spatial Cross-scale Convolution (SCSC), which is verified to be effective in improving both CNNs and Transformers.
no code implementations • 7 Aug 2023 • QiuLing Xu, Pannaga Shivaswamy, Xiangyu Zhang
We subsequently use that metric in an adversarial learning framework to systematically promote disadvantaged items.
1 code implementation • ICCV 2023 • Dongming Wu, Tiancai Wang, Yuang Zhang, Xiangyu Zhang, Jianbing Shen
Referring video object segmentation (RVOS) aims at segmenting an object in a video following human instruction.
no code implementations • 18 Jul 2023 • Zhuoling Li, Chunrui Han, Zheng Ge, Jinrong Yang, En Yu, Haoqian Wang, Hengshuang Zhao, Xiangyu Zhang
Besides, GroupLane with ResNet18 still surpasses PersFormer by 4.9% F1 score, while its inference speed is nearly 7x faster and its FLOPs are only 13.3% of PersFormer's.
no code implementations • 18 Jul 2023 • Liang Zhao, En Yu, Zheng Ge, Jinrong Yang, Haoran Wei, HongYu Zhou, Jianjian Sun, Yuang Peng, Runpei Dong, Chunrui Han, Xiangyu Zhang
Based on precise referring instruction, we propose ChatSpot, a unified end-to-end multimodal large language model that supports diverse forms of interactivity including mouse clicks, drag-and-drop, and drawing boxes, which provides a more flexible and seamless interactive experience.
no code implementations • 17 Jul 2023 • Patrick Emami, Xiangyu Zhang, David Biagioni, Ahmed S. Zamzam
In detail, we theoretically demonstrate that the effects of non-stationarity introduced by multiple timescales can be learned by a periodic multi-agent policy.
1 code implementation • 30 May 2023 • Victoria Y. H. Chua, Hexin Liu, Leibny Paola Garcia Perera, Fei Ting Woon, Jinyi Wong, Xiangyu Zhang, Sanjeev Khudanpur, Andy W. H. Khong, Justin Dauwels, Suzy J. Styles
To enhance the reliability and robustness of language identification (LID) and language diarization (LD) systems for heterogeneous populations and scenarios, there is a need for speech processing models to be trained on datasets that feature diverse language registers and speech patterns.
1 code implementation • 27 May 2023 • Weisong Sun, Yuchen Chen, Guanhong Tao, Chunrong Fang, Xiangyu Zhang, Quanjun Zhang, Bin Luo
Neural code search models are hence behind many such engines.
no code implementations • 23 May 2023 • En Yu, Tiancai Wang, Zhuoling Li, Yuang Zhang, Xiangyu Zhang, Wenbing Tao
Although end-to-end multi-object trackers like MOTR enjoy the merits of simplicity, they suffer seriously from the conflict between detection and association, resulting in unsatisfactory convergence dynamics.
no code implementations • 28 Apr 2023 • Zhiyuan Cheng, Hongjun Choi, James Liang, Shiwei Feng, Guanhong Tao, Dongfang Liu, Michael Zuzak, Xiangyu Zhang
We argue that the weakest link of fusion models depends on their most vulnerable modality, and propose an attack framework that targets advanced camera-LiDAR fusion-based 3D object detection models through camera-only adversarial attacks.
no code implementations • 22 Apr 2023 • Shaoteng Liu, Xiangyu Zhang, Tao Hu, Jiaya Jia
In each iteration, the input to VSA is one view (or multiple views) of a 3D object and the output is a synthesized image in another target pose.
1 code implementation • 15 Apr 2023 • Zhi Cai, Songtao Liu, Guodong Wang, Zheng Ge, Xiangyu Zhang, Di Huang
We propose a metric, recall of best-regressed samples, to quantitatively evaluate the misalignment problem.
1 code implementation • CVPR 2023 • Shiwei Feng, Guanhong Tao, Siyuan Cheng, Guangyu Shen, Xiangzhe Xu, Yingqi Liu, Kaiyuan Zhang, Shiqing Ma, Xiangyu Zhang
We show the effectiveness of our method on image encoders pre-trained on ImageNet and OpenAI's CLIP 400 million image-text pairs.
1 code implementation • ICCV 2023 • Shihao Wang, Yingfei Liu, Tiancai Wang, Ying Li, Xiangyu Zhang
On the standard nuScenes benchmark, it is the first online multi-view method that achieves comparable performance (67.6% NDS & 65.3% AMOTA) with lidar-based methods.
Ranked #1 on 3D Multi-Object Tracking on nuScenes Camera Only
2 code implementations • CVPR 2023 • Yukang Chen, Jianhui Liu, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia
Our core insight is to predict objects directly based on sparse voxel features, without relying on hand-crafted proxies.
Ranked #1 on 3D Object Detection on Argoverse2
no code implementations • 10 Mar 2023 • Chunrui Han, Jinrong Yang, Jianjian Sun, Zheng Ge, Runpei Dong, HongYu Zhou, Weixin Mao, Yuang Peng, Xiangyu Zhang
In this paper, we explore an embarrassingly simple long-term recurrent fusion strategy built upon the LSS-based methods and find it already able to enjoy the merits from both sides, i.e., rich long-term information and efficient fusion pipeline.
1 code implementation • CVPR 2023 • Dongming Wu, Wencheng Han, Tiancai Wang, Xingping Dong, Xiangyu Zhang, Jianbing Shen
In this paper, we propose a new and general referring understanding task, termed referring multi-object tracking (RMOT).
3 code implementations • 5 Feb 2023 • Zekun Qi, Runpei Dong, Guofan Fan, Zheng Ge, Xiangyu Zhang, Kaisheng Ma, Li Yi
This motivates us to learn 3D representations by sharing the merits of both paradigms, which is non-trivial due to the pattern difference between the two paradigms.
Ranked #1 on Zero-Shot Transfer 3D Point Cloud Classification on ModelNet10 (using extra training data)
1 code implementation • 3 Feb 2023 • Nan Jiang, Thibaud Lutellier, Yiling Lou, Lin Tan, Dan Goldwasser, Xiangyu Zhang
KNOD has two major novelties, including (1) a novel three-stage tree decoder, which directly generates Abstract Syntax Trees of patched code according to the inherent tree structure, and (2) a novel domain-rule distillation, which leverages syntactic and semantic rules and teacher-student distributions to explicitly inject the domain knowledge into the decoding procedure during both the training and inference phases.
1 code implementation • 31 Jan 2023 • Zhiyuan Cheng, James Liang, Guanhong Tao, Dongfang Liu, Xiangyu Zhang
We improve adversarial robustness against physical-world attacks using L0-norm-bounded perturbation in training.
1 code implementation • 16 Jan 2023 • Siyuan Cheng, Guanhong Tao, Yingqi Liu, Shengwei An, Xiangzhe Xu, Shiwei Feng, Guangyu Shen, Kaiyuan Zhang, QiuLing Xu, Shiqing Ma, Xiangyu Zhang
Attack forensics, a critical counter-measure for traditional cyber attacks, is hence of importance for defending model backdoor attacks.
2 code implementations • CVPR 2023 • Zhisheng Zhong, Jiequan Cui, Yibo Yang, Xiaoyang Wu, Xiaojuan Qi, Xiangyu Zhang, Jiaya Jia
Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers.
2 code implementations • ICCV 2023 • Junjie Yan, Yingfei Liu, Jianjian Sun, Fan Jia, Shuailin Li, Tiancai Wang, Xiangyu Zhang
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection.
no code implementations • CVPR 2023 • QiuLing Xu, Guanhong Tao, Jean Honorio, Yingqi Liu, Shengwei An, Guangyu Shen, Siyuan Cheng, Xiangyu Zhang
It trains the clone model from scratch on a very small subset of samples and aims to minimize a cloning loss that denotes the differences between the activations of important neurons across the two models.
1 code implementation • 22 Dec 2022 • Yuxuan Cai, Yizhuang Zhou, Qi Han, Jianjian Sun, Xiangwen Kong, Jun Li, Xiangyu Zhang
This architectural scheme gives RevCol very different behavior from conventional networks: during forward propagation, features in RevCol are learned to be gradually disentangled when passing through each column, and their total information is maintained rather than compressed or discarded as in other networks.
Ranked #8 on Semantic Segmentation on ADE20K (using extra training data)
no code implementations • 29 Nov 2022 • Guanhong Tao, Zhenting Wang, Siyuan Cheng, Shiqing Ma, Shengwei An, Yingqi Liu, Guangyu Shen, Zhuo Zhang, Yunshu Mao, Xiangyu Zhang
We leverage 20 different types of injected backdoor attacks in the literature as the guidance and study their correspondences in normally trained models, which we call natural backdoor vulnerabilities.
no code implementations • 28 Nov 2022 • Xiangyu Zhang, Zening Wang, Haiyang Zhang, Luxi Yang
In particular, we first formulate the XL-MIMO near-field channel estimation task as a compressed sensing problem using the spatial gridding-based sparsifying dictionary, and then solve the resulting problem by applying the Learning Iterative Shrinkage and Thresholding Algorithm (LISTA).
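For context on the LISTA step named in the entry above: LISTA unrolls the classical ISTA iteration into network layers and learns its matrices and thresholds from data. The sketch below shows only the unlearned ISTA iteration for a sparse recovery problem of the form y ≈ A x. It is a real-valued toy in pure Python, not the paper's method; the names `ista`, `soft_threshold`, and `matvec`, and the parameters `lam`, `step`, and `n_iter`, are assumptions chosen for this illustration.

```python
def soft_threshold(v, theta):
    """Shrink each entry of v toward zero by theta (the proximal step for an L1 penalty)."""
    return [max(abs(x) - theta, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def matvec(M, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def ista(A, y, lam, step, n_iter):
    """Plain ISTA for min_x 0.5 * ||y - A x||^2 + lam * ||x||_1.

    step is assumed to be at most 1 / L, where L is the largest
    eigenvalue of A^T A; LISTA would learn these quantities instead.
    """
    At = [list(col) for col in zip(*A)]  # transpose of A
    x = [0.0] * len(A[0])
    for _ in range(n_iter):
        residual = [yi - pi for yi, pi in zip(y, matvec(A, x))]
        # Gradient step on the data-fit term, then shrinkage.
        z = [xi + step * gi for xi, gi in zip(x, matvec(At, residual))]
        x = soft_threshold(z, step * lam)
    return x
```

With an identity dictionary and unit step size, the iteration reduces to a single soft-thresholding of the measurements, which makes the shrinkage effect easy to check by hand.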
1 code implementation • 21 Nov 2022 • Hongchao Shu, Ruixing Liang, Zhaoshuo Li, Anna Goodridge, Xiangyu Zhang, Hao Ding, Nimesh Nagururu, Manish Sahu, Francis X. Creighton, Russell H. Taylor, Adnan Munawar, Mathias Unberath
Twin-S tracks and updates the virtual model in real-time given measurements from modern tracking technologies.
2 code implementations • ICCV 2023 • HongYu Zhou, Zheng Ge, Zeming Li, Xiangyu Zhang
This paper proposes an efficient multi-camera to Bird's-Eye-View (BEV) view transformation method for 3D perception, dubbed MatrixVT.
Ranked #2 on Bird's-Eye View Semantic Segmentation on nuScenes (IoU lane - 224x480 - 100x100 at 0.5 metric)
4 code implementations • CVPR 2023 • Yuang Zhang, Tiancai Wang, Xiangyu Zhang
In this paper, we propose MOTRv2, a simple yet effective pipeline to bootstrap end-to-end multi-object tracking with a pretrained object detector.
Ranked #2 on Multi-Object Tracking on DanceTrack (using extra training data)
no code implementations • 15 Nov 2022 • Jinrong Yang, Tiancai Wang, Zheng Ge, Weixin Mao, Xiaoping Li, Xiangyu Zhang
We propose a temporal 2D transformation to bridge the 3D predictions with temporal 2D labels.
3 code implementations • 27 Oct 2022 • Yuang Zhang, Tiancai Wang, Weiyao Lin, Xiangyu Zhang
We present our 1st place solution to the Group Dance Multiple People Tracking Challenge.
1 code implementation • 23 Oct 2022 • Kaiyuan Zhang, Guanhong Tao, QiuLing Xu, Siyuan Cheng, Shengwei An, Yingqi Liu, Shiwei Feng, Guangyu Shen, Pin-Yu Chen, Shiqing Ma, Xiangyu Zhang
In this work, we theoretically analyze the connection among cross-entropy loss, attack success rate, and clean accuracy in this setting.
no code implementations • 21 Oct 2022 • Yu Xuan, Xiangyu Zhang, Shuyue Stella Li, Zihan Shen, Xin Xie, Leibny Paola Garcia, Roberto Togneri
Compared with the state-of-the-art MSF-ANC method, CRLS shows improved performance.
1 code implementation • 18 Oct 2022 • David Biagioni, Xiangyu Zhang, Christiane Adcock, Michael Sinner, Peter Graf, Jennifer King
We demonstrate, in this context, that hybrid methods offer many benefits over both purely model-free and model-based methods as long as certain requirements are met.
no code implementations • 6 Oct 2022 • Shuyue Stella Li, Xiangyu Zhang, Shu Zhou, Hongchao Shu, Ruixing Liang, Hexin Liu, Leibny Paola Garcia
In this work, we propose a highly Portable Quantum Language Model (PQLM) that can easily transmit information to downstream tasks on classical machines.
no code implementations • 26 Sep 2022 • Xiangyu Zhang, Shuyue Stella Li, Zhanhong He, Roberto Togneri, Leibny Paola Garcia
Lyrics recognition is an important task in music processing.
no code implementations • CVPR 2023 • Xuanyang Zhang, Yonggang Li, Xiangyu Zhang, Yongtao Wang, Jian Sun
Differentiable architecture search (DARTS) has significantly promoted the development of NAS techniques because of its high search efficiency and effectiveness but suffers from performance collapse.
Ranked #13 on Neural Architecture Search on NAS-Bench-201, CIFAR-10
no code implementations • CVPR 2023 • Xiangwen Kong, Xiangyu Zhang
Recently, Masked Image Modeling (MIM) achieves great success in self-supervised visual recognition.
1 code implementation • 30 Jul 2022 • Junqiang Huang, Xiangwen Kong, Xiangyu Zhang
We focus on better understanding the critical factors of augmentation-invariant representation learning.
1 code implementation • 11 Jul 2022 • Zhiyuan Cheng, James Liang, Hongjun Choi, Guanhong Tao, Zhiwen Cao, Dongfang Liu, Xiangyu Zhang
Experimental results show that our method can generate stealthy, effective, and robust adversarial patches for different target objects and models and achieves more than 6 meters mean depth estimation error and 93% attack success rate (ASR) in object detection with a patch of 1/9 of the vehicle's rear area.
no code implementations • 7 Jul 2022 • Yangming Zhou, Qichao Ying, Xiangyu Zhang, Zhenxing Qian, Sheng Li, Xinpeng Zhang
We jointly train a 3D-UNet-based watermark embedding network and a decoder that predicts the tampering mask.
2 code implementations • CVPR 2023 • Yukang Chen, Jianhui Liu, Xiangyu Zhang, Xiaojuan Qi, Jiaya Jia
Recent advances in 2D CNNs have revealed that large kernels are important.
no code implementations • 18 Jun 2022 • Guanhong Tao, Yingqi Liu, Siyuan Cheng, Shengwei An, Zhuo Zhang, QiuLing Xu, Guangyu Shen, Xiangyu Zhang
As such, using the samples derived from our attack in adversarial training can harden a model against these backdoor vulnerabilities.
1 code implementation • ICCV 2023 • Yingfei Liu, Junjie Yan, Fan Jia, Shuailin Li, Aqi Gao, Tiancai Wang, Xiangyu Zhang, Jian Sun
More specifically, we extend the 3D position embedding (3D PE) in PETR for temporal modeling.
Ranked #2 on Bird's-Eye View Semantic Segmentation on nuScenes (IoU lane - 224x480 - 100x100 at 0.5 metric)
1 code implementation • 30 May 2022 • Xin Wen, Bingchen Zhao, Anlin Zheng, Xiangyu Zhang, Xiaojuan Qi
The semantic grouping is performed by assigning pixels to a set of learnable prototypes, which can adapt to each sample by attentive pooling over the feature and form new slots.
Ranked #15 on Unsupervised Semantic Segmentation on COCO-Stuff-27 (Accuracy metric)
1 code implementation • 30 May 2022 • Xiaohan Ding, Honghao Chen, Xiangyu Zhang, Kaiqi Huang, Jungong Han, Guiguang Ding
For the extreme simplicity of model structure, we focus on a VGG-style plain model and showcase that such a simple model trained with a RepOptimizer, which is referred to as RepOpt-VGG, performs on par with or better than the recent well-designed models.
1 code implementation • 22 May 2022 • Liqi Yan, Qifan Wang, Yiming Cui, Fuli Feng, Xiaojun Quan, Xiangyu Zhang, Dongfang Liu
Video captioning is a challenging task as it needs to accurately transform visual understanding into natural language description.
2 code implementations • CVPR 2022 • Yukang Chen, Yanwei Li, Xiangyu Zhang, Jian Sun, Jiaya Jia
In this paper, we introduce two new modules to enhance the capability of Sparse CNNs, both based on making feature sparsity learnable with position-wise importance prediction.
1 code implementation • 11 Apr 2022 • Guocheng Qian, Xuanyang Zhang, Guohao Li, Chen Zhao, Yukang Chen, Xiangyu Zhang, Bernard Ghanem, Jian Sun
TNAS performs a modified bi-level Breadth-First Search in the proposed trees to discover a high-performance architecture.
9 code implementations • 10 Apr 2022 • Liangyu Chen, Xiaojie Chu, Xiangyu Zhang, Jian Sun
Although there have been significant advances in the field of image restoration recently, the system complexity of the state-of-the-art (SOTA) methods is increasing as well, which may hinder the convenient analysis and comparison of methods.
Ranked #1 on Deblurring on MSU BASED
no code implementations • 29 Mar 2022 • Xiangyu Zhang, Peter I. Frazier
Although an average-case-optimal policy can be computed via stochastic dynamic programming, the computation required grows exponentially with the number of arms $N$.
1 code implementation • CVPR 2022 • Zhiyuan Liang, Tiancai Wang, Xiangyu Zhang, Jian Sun, Jianbing Shen
The tree energy loss is effective and easy to incorporate into existing frameworks by combining it with a traditional segmentation loss.
2 code implementations • CVPR 2022 • Anlin Zheng, Yuang Zhang, Xiangyu Zhang, Xiaojuan Qi, Jian Sun
Experiments show that our method can significantly boost the performance of query-based detectors in crowded scenes.
Ranked #1 on Object Detection on CrowdHuman
7 code implementations • CVPR 2022 • Xiaohan Ding, Xiangyu Zhang, Yizhuang Zhou, Jungong Han, Guiguang Ding, Jian Sun
We revisit large kernel design in modern convolutional neural networks (CNNs).
Ranked #75 on Image Classification on ImageNet
1 code implementation • 10 Mar 2022 • Yingfei Liu, Tiancai Wang, Xiangyu Zhang, Jian Sun
Object query can perceive the 3D position-aware features and perform end-to-end object detection.
1 code implementation • 8 Mar 2022 • Xiangyu Zhang, Abinet Tesfaye Eseye, Bernard Knueven, Weijia Liu, Matthew Reynolds, Wesley Jones
This paper focuses on the critical load restoration problem in distribution systems following major outages.
no code implementations • 11 Feb 2022 • Guangyu Shen, Yingqi Liu, Guanhong Tao, QiuLing Xu, Zhuo Zhang, Shengwei An, Shiqing Ma, Xiangyu Zhang
We develop a novel optimization method for NLP backdoor inversion.
2 code implementations • CVPR 2022 • Yin-Yin He, Peizhen Zhang, Xiu-Shen Wei, Xiangyu Zhang, Jian Sun
In this paper, we explore excavating the confusion matrix, which carries the fine-grained misclassification details, to relieve the pairwise biases, generalizing the coarse one.
no code implementations • 5 Jan 2022 • Weijie Zhao, Xuewu Jiao, Mingqing Hu, Xiaoyun Li, Xiangyu Zhang, Ping Li
In this paper, we propose a hardware-aware training workflow that couples the hardware topology into the algorithm design.
1 code implementation • CVPR 2022 • QiuLing Xu, Guanhong Tao, Xiangyu Zhang
We propose a novel adversarial attack targeting content features in some deep layer, that is, individual neurons in the layer.
no code implementations • CVPR 2022 • Guanhong Tao, Guangyu Shen, Yingqi Liu, Shengwei An, QiuLing Xu, Shiqing Ma, Pan Li, Xiangyu Zhang
A popular trigger inversion method is by optimization.
1 code implementation • CVPR 2022 • Yingqi Liu, Guangyu Shen, Guanhong Tao, Zhenting Wang, Shiqing Ma, Xiangyu Zhang
Our results on the TrojAI competition rounds 2-4, which have patch backdoors and filter backdoors, show that existing scanners may produce hundreds of false positives (i.e., clean models recognized as trojaned), while our technique removes 78-100% of them with a small increase of false negatives by 0-30%, leading to 17-41% overall accuracy improvement.
4 code implementations • CVPR 2022 • Xiaohan Ding, Honghao Chen, Xiangyu Zhang, Jungong Han, Guiguang Ding
Our results reveal that 1) Locality Injection is a general methodology for MLP models; 2) RepMLPNet has a favorable accuracy-efficiency trade-off compared to the other MLPs; 3) RepMLPNet is the first MLP that seamlessly transfers to Cityscapes semantic segmentation.
Ranked #61 on Semantic Segmentation on Cityscapes val
1 code implementation • 19 Dec 2021 • Wenbo Li, Xin Lu, Shengju Qian, Jiangbo Lu, Xiangyu Zhang, Jiaya Jia
Pre-training has marked numerous state of the arts in high-level computer vision, while few attempts have ever been made to investigate how pre-training acts in image processing systems.
Ranked #5 on Image Super-Resolution on Set5 - 2x upscaling (using extra training data)
1 code implementation • 9 Dec 2021 • Lufan Ma, Tiancai Wang, Bin Dong, Jiangpeng Yan, Xiu Li, Xiangyu Zhang
Our IFR enjoys several advantages: 1) simulates an infinite-depth refinement network while only requiring the parameters of a single residual block; 2) produces high-level equilibrium instance features with a global receptive field; 3) serves as a plug-and-play general module easily extended to most object recognition frameworks.
no code implementations • 6 Dec 2021 • Yunxiang Zhang, Xiangyu Zhang, Peter I. Frazier
Recent advances in computationally efficient non-myopic Bayesian optimization (BO) improve query efficiency over traditional myopic methods like expected improvement while only modestly increasing computational cost.
no code implementations • NeurIPS 2021 • Ruosi Wan, Zhanxing Zhu, Xiangyu Zhang, Jian Sun
Specifically, 1) we introduce the assumptions that can lead to equilibrium state in SMD, and prove equilibrium can be reached in a linear rate regime under given assumptions; 2) we propose "angular update" as a substitute for effective learning rate to depict the state of SMD, and derive the theoretical value of angular update in equilibrium state; 3) we verify our assumptions and theoretical results on various large-scale computer vision tasks including ImageNet and MSCOCO with standard settings.
no code implementations • NeurIPS 2021 • Yunxiang Zhang, Xiangyu Zhang, Peter Frazier
Recent advances in computationally efficient non-myopic Bayesian optimization offer improved query efficiency over traditional myopic methods like expected improvement, with only a modest increase in computational cost.
1 code implementation • 10 Nov 2021 • David Biagioni, Xiangyu Zhang, Dylan Wald, Deepthi Vaidhynathan, Rohit Chintala, Jennifer King, Ahmed S. Zamzam
We present the PowerGridworld software package to provide users with a lightweight, modular, and customizable framework for creating power-systems-focused, multi-agent Gym environments that readily integrate with existing training frameworks for reinforcement learning (RL).
no code implementations • 8 Nov 2021 • David J. Biagioni, Xiangyu Zhang, Peter Graf, Devon Sigler, Wesley Jones
We demonstrate that optimal control for this problem is challenging, requiring more than 8-hour lookahead for MPC with perfect forecasting to attain the minimum cost.
no code implementations • 25 Oct 2021 • Wei Zhou, Xiangyu Zhang, Hongyu Wang, Shenghua Gao, Xin Lou
It is shown that by adding another transformation, the proposed method is able to synthesize high-quality RAW Bayer images with arbitrary size.
1 code implementation • NeurIPS 2021 • Zijian Kang, Peizhen Zhang, Xiangyu Zhang, Jian Sun, Nanning Zheng
Knowledge distillation has shown great success in classification; however, it is still challenging for detection.
no code implementations • 12 Oct 2021 • Qichao Ying, Xiaoxiao Hu, Xiangyu Zhang, Zhenxing Qian, Xinpeng Zhang
At the recipient's side, ACP extracts the watermark from the attacked image, and we conduct feature matching on the original and extracted watermark to locate the position of the crop in the original image plane.
no code implementations • 26 Sep 2021 • Xuanyang Zhang, Xiangyu Zhang, Jian Sun
Knowledge distillation field delicately designs various types of knowledge to shrink the performance gap between compact student and large-scale teacher.
1 code implementation • 23 Sep 2021 • Peizhen Zhang, Zijian Kang, Tong Yang, Xiangyu Zhang, Nanning Zheng, Jian Sun
Instead, we generate an instructive knowledge based only on student representations and regular labels.
2 code implementations • 15 Sep 2021 • Yingming Wang, Xiangyu Zhang, Tong Yang, Jian Sun
Thanks to the query design and the attention variant, the proposed detector, which we call Anchor DETR, can achieve better performance and run faster than DETR with 10$\times$ fewer training epochs.
no code implementations • ICCV 2021 • Yi Wang, Lu Qi, Ying-Cong Chen, Xiangyu Zhang, Jiaya Jia
In this paper, we present a novel approach to synthesize realistic images based on their semantic layouts.
no code implementations • 2 Aug 2021 • Ramin Bashizade, Xiangyu Zhang, Sayan Mukherjee, Alvin R. Lebeck
In this paper, we propose a high-throughput accelerator for Markov Random Field (MRF) inference, a powerful model for representing a wide range of applications, using MCMC with Gibbs sampling.
no code implementations • 25 Jul 2021 • Xiangyu Zhang, Peter I. Frazier
Thus, there is substantial value in understanding the performance of index policies and other policies that can be computed efficiently for large $N$.
no code implementations • 30 Jun 2021 • Yisroel Mirsky, Ambra Demontis, Jaidip Kotak, Ram Shankar, Deng Gelei, Liu Yang, Xiangyu Zhang, Wenke Lee, Yuval Elovici, Battista Biggio
Although offensive AI has been discussed in the past, there is a need to analyze and understand the threat in the context of organizations.
1 code implementation • NeurIPS 2021 • Bin Dong, Fangao Zeng, Tiancai Wang, Xiangyu Zhang, Yichen Wei
Moreover, the joint learning of unified query representation can greatly improve the detection performance of DETR.
Ranked #4 on Object Detection on COCO minival (AP75 metric)
no code implementations • 17 May 2021 • Andrey Ignatov, Kim Byeoung-su, Radu Timofte, Angeline Pouget, Fenglong Song, Cheng Li, Shuai Xiao, Zhongqian Fu, Matteo Maggioni, Yibin Huang, Shen Cheng, Xin Lu, Yifeng Zhou, Liangyu Chen, Donghao Liu, Xiangyu Zhang, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Minsu Kwon, Myungje Lee, Jaeyoon Yoo, Changbeom Kang, Shinjo Wang, Bin Huang, Tianbao Zhou, Shuai Liu, Lei Lei, Chaoyu Feng, Liguang Huang, Zhikun Lei, Feifei Chen
A detailed description of all models developed in the challenge is provided in this paper.
2 code implementations • 7 May 2021 • Fangao Zeng, Bin Dong, Yuang Zhang, Tiancai Wang, Xiangyu Zhang, Yichen Wei
Temporal modeling of objects is a key challenge in multiple object tracking (MOT).
Ranked #1 on Multi-Object Tracking on MOT17 (e2e-MOT metric)
10 code implementations • 5 May 2021 • Xiaohan Ding, Chunlong Xia, Xiangyu Zhang, Xiaojie Chu, Jungong Han, Guiguang Ding
We propose RepMLP, a multi-layer-perceptron-style neural network building block for image recognition, which is composed of a series of fully-connected (FC) layers.
Ranked #754 on Image Classification on ImageNet
1 code implementation • CVPR 2021 • Liangyu Chen, Tong Yang, Xiangyu Zhang, Wei Zhang, Jian Sun
We propose a novel point annotated setting for the weakly semi-supervised object detection task, in which the dataset comprises small fully annotated images and large weakly annotated images by points.
no code implementations • 29 Mar 2021 • Xiangyu Zhang, Zhengming Zhang, Luxi Yang
We model the HUDNs as a heterogeneous graph and train a Graph Neural Network (GNN) to approach this representation function by using semi-supervised learning, in which the loss function is composed of the unsupervised part that helps the GNN approach the optimal representation function and the supervised part that utilizes the previous experience to reduce useless exploration.
2 code implementations • CVPR 2021 • Xiaohan Ding, Xiangyu Zhang, Jungong Han, Guiguang Ding
We propose a universal building block of Convolutional Neural Network (ConvNet) to improve the performance without any inference-time costs.
6 code implementations • CVPR 2021 • Qiang Chen, Yingming Wang, Tong Yang, Xiangyu Zhang, Jian Cheng, Jian Sun
From the perspective of optimization, we introduce an alternative way to address the problem instead of adopting the complex feature pyramids: utilizing only one-level feature for detection.
Ranked #142 on Object Detection on COCO test-dev
no code implementations • 16 Mar 2021 • Yingqi Liu, Guangyu Shen, Guanhong Tao, Zhenting Wang, Shiqing Ma, Xiangyu Zhang
A prominent challenge is hence to distinguish natural features and injected backdoors.
1 code implementation • 9 Feb 2021 • Guangyu Shen, Yingqi Liu, Guanhong Tao, Shengwei An, QiuLing Xu, Siyuan Cheng, Shiqing Ma, Xiangyu Zhang
By iteratively and stochastically selecting the most promising labels for optimization with the guidance of an objective function, we substantially reduce the complexity, allowing the method to handle models with many classes.
1 code implementation • CVPR 2021 • Xuanyang Zhang, Pengfei Hou, Xiangyu Zhang, Jian Sun
In this paper, we investigate a new variant of neural architecture search (NAS) paradigm -- searching with random labels (RLNAS).
23 code implementations • CVPR 2021 • Xiaohan Ding, Xiangyu Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun
We present a simple but powerful architecture of convolutional neural network, which has a VGG-like inference-time body composed of nothing but a stack of 3x3 convolution and ReLU, while the training-time model has a multi-branch topology.
Ranked #44 on Semantic Segmentation on Cityscapes val
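The multi-branch-to-plain conversion described in the entry above can be illustrated with a minimal sketch. It assumes a single channel, training-time branches consisting of a 3x3 convolution, a 1x1 convolution and an identity shortcut, and no batch normalization; these simplifications, and the names `conv3x3`, `multi_branch` and `merge_branches`, are illustrative and are not taken from the listing or from the authors' released code.

```python
def conv3x3(x, k):
    """Zero-padded 3x3 convolution (cross-correlation) on one 2D channel."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += k[di + 1][dj + 1] * x[ii][jj]
            out[i][j] = acc
    return out


def multi_branch(x, k3, k1):
    """Training-time form: 3x3 branch + 1x1 branch (scalar k1) + identity."""
    y3 = conv3x3(x, k3)
    return [[y3[i][j] + k1 * x[i][j] + x[i][j] for j in range(len(x[0]))]
            for i in range(len(x))]


def merge_branches(k3, k1):
    """Fold the 1x1 and identity branches into the centre of the 3x3 kernel."""
    merged = [row[:] for row in k3]
    merged[1][1] += k1 + 1.0  # a 1x1 conv and the identity only touch the centre tap
    return merged


x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k3 = [[0.5, 0.0, -0.5], [1.0, 2.0, -1.0], [0.0, 0.25, 0.0]]
k1 = 0.5
same = multi_branch(x, k3, k1) == conv3x3(x, merge_branches(k3, k1))
```

Because convolution is linear, the three branches sum into one kernel, so the inference-time model needs only the single merged 3x3 convolution.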
no code implementations • 25 Dec 2020 • Tiancai Wang, Xiangyu Zhang, Jian Sun
In this paper, we present an implicit feature pyramid network (i-FPN) for object detection.
2 code implementations • 21 Dec 2020 • Siyuan Cheng, Yingqi Liu, Shiqing Ma, Xiangyu Zhang
Trojan (backdoor) attack is a form of adversarial attack on deep neural networks where the attacker provides victims with a model trained/retrained on malicious data.
1 code implementation • NeurIPS 2020 • Lin Song, Yanwei Li, Zhengkai Jiang, Zeming Li, Xiangyu Zhang, Hongbin Sun, Jian Sun, Nanning Zheng
The Learnable Tree Filter presents a remarkable approach to model structure-preserving relations for semantic segmentation.
1 code implementation • 3 Dec 2020 • Tiancai Wang, Tong Yang, Jiale Cao, Xiangyu Zhang
Object detectors usually achieve promising results with the supervision of complete instance annotations.
no code implementations • 30 Oct 2020 • Hongjing Yang, Shude Mao, Weicheng Zang, Xiangyu Zhang
Additionally, we find the asymptotic power-law behaviors in both $\theta_{\rm E}$ and $\pi_{\rm E}$ distributions, and we provide a simple model to understand them.
Astrophysics of Galaxies Earth and Planetary Astrophysics Solar and Stellar Astrophysics
no code implementations • 6 Oct 2020 • Zeming Li, Yuchen Ma, Yukang Chen, Xiangyu Zhang, Jian Sun
In this report, we present our object detection/instance segmentation system, MegDetV2, which works in a two-pass fashion, first to detect instances then to obtain segmentation.
1 code implementation • 5 Oct 2020 • Benjin Zhu, Junqiang Huang, Zeming Li, Xiangyu Zhang, Jian Sun
In this paper, we propose EqCo (Equivalent Rules for Contrastive Learning) to make self-supervised learning irrelevant to the number of negative samples in the contrastive learning framework.
no code implementations • 28 Sep 2020 • Zeyu Fu, Yang Sun, Xiangyu Zhang, Scott Stainton, Shaun Barney, Jeffry Hogg, William Innes, Satnam Dlay
In this paper, we propose a novel multiprediction guided attention network (MPG-Net) for automated retinal layer segmentation in OCT images.
1 code implementation • 17 Sep 2020 • Prem Devanbu, Matthew Dwyer, Sebastian Elbaum, Michael Lowry, Kevin Moran, Denys Poshyvanyk, Baishakhi Ray, Rishabh Singh, Xiangyu Zhang
The intent of this report is to serve as a potential roadmap to guide future work that sits at the intersection of SE & DL.
4 code implementations • CVPR 2021 • Ningning Ma, Xiangyu Zhang, Ming Liu, Jian Sun
We present a simple, effective, and general activation function we term ACON which learns to activate the neurons or not.
no code implementations • 30 Jul 2020 • Junyu Lin, Lei Xu, Yingqi Liu, Xiangyu Zhang
The technique does not require any knowledge of the structure or weights of the target DNN.
2 code implementations • ECCV 2020 • Ningning Ma, Xiangyu Zhang, Jiawei Huang, Jian Sun
WeightNet is easy and memory-conserving to train, on the kernel space instead of the feature space.
6 code implementations • ECCV 2020 • Ningning Ma, Xiangyu Zhang, Jian Sun
We present a conceptually simple but effective funnel activation for image recognition tasks, called Funnel activation (FReLU), that extends ReLU and PReLU to a 2D activation by adding a negligible overhead of spatial condition.
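As a rough illustration of the "spatial condition" mentioned above, the sketch below computes max(x, T(x)) on one channel, where T is taken to be a zero-padded 3x3 per-channel window with learnable weights. The batch normalization usually paired with T is omitted, and the function name `frelu` and the kernel values are hypothetical.

```python
def frelu(x, kernel):
    """Funnel activation on one 2D channel: out = max(x, T(x)).

    x      : H x W list of floats
    kernel : 3x3 list of weights for the spatial condition T (assumed learned)
    """
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            t = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:  # zero padding at the border
                        t += kernel[di + 1][dj + 1] * x[ii][jj]
            out[i][j] = max(x[i][j], t)
    return out


x = [[-1.0, 2.0], [3.0, -4.0]]
zero_kernel = [[0.0] * 3 for _ in range(3)]
centre_kernel = [[0.0, 0.0, 0.0], [0.0, 0.25, 0.0], [0.0, 0.0, 0.0]]
relu_like = frelu(x, zero_kernel)     # T(x) = 0 reduces to ReLU
prelu_like = frelu(x, centre_kernel)  # T(x) = 0.25 * x reduces to PReLU with slope 0.25
```

The two special cases show in what sense the activation extends ReLU and PReLU: both are recovered when the window ignores the neighbouring pixels.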
1 code implementation • ECCV 2020 • Miao Hao, Yitao Liu, Xiangyu Zhang, Jian Sun
In this paper we propose a new intermediate supervision method, named LabelEnc, to boost the training of object detection systems.
no code implementations • 4 Jul 2020 • Yun Li, Zechun Liu, Weiqun Wu, Haotian Yao, Xiangyu Zhang, Chi Zhang, Baoqun Yin
In this paper, a simple yet effective network pruning framework is proposed to simultaneously address the problems of pruning indicator, pruning ratio, and efficiency constraint.
no code implementations • 15 Jun 2020 • Ruosi Wan, Zhanxing Zhu, Xiangyu Zhang, Jian Sun
In this work, we comprehensively reveal the learning dynamics of neural network with normalization, weight decay (WD), and SGD (with momentum), named as Spherical Motion Dynamics (SMD).
no code implementations • 12 Jun 2020 • Qiu-Ling Xu, Guanhong Tao, Xiangyu Zhang
We propose a novel technique that can generate natural-looking adversarial examples by bounding the variations induced for internal activation values in some deep layer(s), through a distribution quantile bound and a polynomial barrier loss function.
1 code implementation • 26 May 2020 • Sara Algeri, Xiangyu Zhang
Classical tests of goodness-of-fit aim to validate the conformity of a postulated model to the data under study.
Methodology Statistics Theory Applications Statistics Theory
no code implementations • 18 May 2020 • Zechun Liu, Xiangyu Zhang, Zhiqiang Shen, Zhe Li, Yichen Wei, Kwang-Ting Cheng, Jian Sun
To tackle these three naturally different dimensions, we propose a general framework by defining pruning as seeking the best pruning vector (i.e., the numerical value of layer-wise channel number, spatial size, depth) and constructing a unique mapping from the pruning vector to the pruned network structures.
1 code implementation • ECCV 2020 • Yiming Hu, Yuding Liang, Zichao Guo, Ruosi Wan, Xiangyu Zhang, Yichen Wei, Qingyi Gu, Jian Sun
Comprehensive experiments show that ABS can dramatically enhance existing NAS approaches by providing a promising shrunk search space.
1 code implementation • 26 Apr 2020 • Qiu-Ling Xu, Guanhong Tao, Siyuan Cheng, Xiangyu Zhang
We propose a new adversarial attack to Deep Neural Networks for image classification.
4 code implementations • 26 Apr 2020 • Yukang Chen, Peizhen Zhang, Zeming Li, Yanwei Li, Xiangyu Zhang, Lu Qi, Jian Sun, Jiaya Jia
We propose a Dynamic Scale Training paradigm (abbreviated as DST) to mitigate scale variation challenge in object detection.
no code implementations • 14 Apr 2020 • Yichao Wang, Xiangyu Zhang, Zhirong Liu, Zhenhua Dong, Xinhua Feng, Ruiming Tang, Xiuqiang He
To overcome such limitation, our re-ranking model proposes a personalized DPP to model the trade-off between accuracy and diversity for each individual user.
1 code implementation • CVPR 2020 • Yi Wang, Ying-Cong Chen, Xiangyu Zhang, Jian Sun, Jiaya Jia
Traditional convolution-based generative adversarial networks synthesize images based on hierarchical local operations, where long-range dependency relation is implicitly modeled with a Markov chain.
1 code implementation • CVPR 2020 • Tiancai Wang, Tong Yang, Martin Danelljan, Fahad Shahbaz Khan, Xiangyu Zhang, Jian Sun
Human-object interaction (HOI) detection strives to localize both the human and an object as well as the identification of complex interactions between them.
no code implementations • CVPR 2021 • Jin Chen, Xijun Wang, Zichao Guo, Xiangyu Zhang, Jian Sun
More gracefully, our DRConv transfers the increasing channel-wise filters to the spatial dimension with a learnable instructor, which not only improves the representation ability of convolution, but also maintains computational cost and translation-invariance as standard convolution does.
Ranked #14 on Semantic Segmentation on MCubeS
1 code implementation • CVPR 2020 • Yanwei Li, Lin Song, Yukang Chen, Zeming Li, Xiangyu Zhang, Xingang Wang, Jian Sun
To demonstrate the superiority of the dynamic property, we compare with several static architectures, which can be modeled as special cases in the routing space.
3 code implementations • CVPR 2020 • Xuangeng Chu, Anlin Zheng, Xiangyu Zhang, Jian Sun
We propose a simple yet effective proposal-based object detector, aiming at detecting highly-overlapped instances in crowded scenes.
Ranked #2 on Pedestrian Detection on TJU-Ped-campus
no code implementations • 13 Mar 2020 • Lu Qi, Yi Wang, Yukang Chen, Yingcong Chen, Xiangyu Zhang, Jian Sun, Jiaya Jia
In this paper, we explore the mask representation in instance segmentation with Point-of-Interest (PoI) features.
4 code implementations • ECCV 2020 • Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, Haoqian Wang, Xiangyu Zhang, Xinyu Zhou, Erjin Zhou, Jian Sun
To tackle this problem, we propose an efficient attention mechanism - Pose Refine Machine (PRM) to make a trade-off between local and global representations in output features and further refine the keypoint locations.
Ranked #1 on Keypoint Detection on COCO test-challenge
no code implementations • 5 Mar 2020 • Xiangyu Zhang, Ramin Bashizade, Yicheng Wang, Cheng Lyu, Sayan Mukherjee, Alvin R. Lebeck
Applying the framework to guide design space exploration shows that statistical robustness comparable to floating-point software can be achieved by slightly increasing the bit representation, without floating-point hardware requirements.
1 code implementation • ICLR 2020 • Junjie Yan, Ruosi Wan, Xiangyu Zhang, Wei Zhang, Yichen Wei, Jian Sun
Therefore many modified normalization techniques have been proposed, which either fail to restore the performance of BN completely, or have to introduce additional nonlinear operations in inference procedure and increase huge consumption.
no code implementations • 8 Nov 2019 • David Biagioni, Peter Graf, Xiangyu Zhang, Ahmed Zamzam, Kyri Baker, Jennifer King
We propose a novel data-driven method to accelerate the convergence of Alternating Direction Method of Multipliers (ADMM) for solving distributed DC optimal power flow (DC-OPF) where lines are shared between independent network partitions.
no code implementations • 27 Oct 2019 • Xiangyu Zhang, Sayan Mukherjee, Alvin R. Lebeck
Although a common approach is to compare the end-point result quality using community-standard benchmarks and metrics, we claim a probabilistic architecture should provide some measure (or guarantee) of statistical robustness.
no code implementations • 25 Sep 2019 • Yichen Zhu, Xiangyu Zhang, Tong Yang, Jian Sun
We introduce the adaptive resizable networks as dynamic networks, which further improve the performance with less computational cost via data-dependent inference.
no code implementations • 25 Sep 2019 • Shizheng Qin, Yichen Zhu, Pengfei Hou, Xiangyu Zhang, Wenqiang Zhang, Jian Sun
In this paper, we propose a learnable sampling module based on variational auto-encoder (VAE) for neural architecture search (NAS), named as VAENAS, which can be easily embedded into existing weight sharing NAS framework, e.g., one-shot approach and gradient-based approach, and significantly improve the performance of searching results.
no code implementations • 6 Sep 2019 • Yongqiang Tian, Shiqing Ma, Ming Wen, Yepang Liu, Shing-Chi Cheung, Xiangyu Zhang
The corresponding rate for the object detection models is over 8.5%.
no code implementations • 28 Apr 2019 • Hanchen Xu, Xiao Li, Xiangyu Zhang, Junbo Zhang
In this letter, we address the problem of controlling energy storage systems (ESSs) for arbitrage in real-time electricity markets under price uncertainty.
6 code implementations • ECCV 2020 • Zichao Guo, Xiangyu Zhang, Haoyuan Mu, Wen Heng, Zechun Liu, Yichen Wei, Jian Sun
It is easy to train and fast to search.
Ranked #88 on Neural Architecture Search on ImageNet (Accuracy metric)
2 code implementations • NeurIPS 2019 • Yukang Chen, Tong Yang, Xiangyu Zhang, Gaofeng Meng, Xinyu Xiao, Jian Sun
In this work, we present DetNAS to use Neural Architecture Search (NAS) for the design of better backbones for object detection.
2 code implementations • ICCV 2019 • Zechun Liu, Haoyuan Mu, Xiangyu Zhang, Zichao Guo, Xin Yang, Tim Kwang-Ting Cheng, Jian Sun
In this paper, we propose a novel meta learning approach for automatic channel pruning of very deep neural networks.
2 code implementations • CVPR 2019 • Xuecai Hu, Haoyuan Mu, Xiangyu Zhang, Zilei Wang, Tieniu Tan, Jian Sun
In this work, we propose a novel method called Meta-SR to firstly solve super-resolution of arbitrary scale factor (including non-integer scale factors) with a single model.
1 code implementation • 6 Feb 2019 • Derui Wang, Chaoran Li, Sheng Wen, Qing-Long Han, Surya Nepal, Xiangyu Zhang, Yang Xiang
Experimental results demonstrate that the attack effectively stops NMS from filtering redundant bounding boxes.
1 code implementation • NeurIPS 2018 • Guanhong Tao, Shiqing Ma, Yingqi Liu, Xiangyu Zhang
Results show that our technique can achieve 94% detection accuracy for 7 different kinds of attacks with 9.91% false positives on benign inputs.
4 code implementations • CVPR 2019 • Yihui He, Chenchen Zhu, Jianren Wang, Marios Savvides, Xiangyu Zhang
Large-scale object detection datasets (e.g., MS-COCO) try to define the ground truth bounding boxes as clearly as possible.
Ranked #22 on Object Detection on PASCAL VOC 2007
no code implementations • ECCV 2018 • Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun
(1) Recent object detectors like FPN and RetinaNet usually involve extra stages against the task of image classification to handle the objects with various scales.
no code implementations • JMIHI 2018 • Feifei Liu, Chengyu Liu, Lina Zhao, Xiangyu Zhang, Xiaoling Wu, Xiaoyan Xu, Yulin Liu, Caiyun Ma, Shoushui Wei, Zhiqiang He, Jianqing Li, Eddie Ng Yin Kwee
Over the past few decades, methods for classification and detection of rhythm or morphology abnormalities in ECG signals have been widely studied.
35 code implementations • ECCV 2018 • Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, Jian Sun
Datasets, Transforms and Models specific to Computer Vision
Ranked #877 on Image Classification on ImageNet
no code implementations • NeurIPS 2018 • Tong Yang, Xiangyu Zhang, Zeming Li, Wenqiang Zhang, Jian Sun
We propose a novel and flexible anchor mechanism named MetaAnchor for object detection frameworks.
1 code implementation • 30 Apr 2018 • Shuai Shao, Zijian Zhao, Boxun Li, Tete Xiao, Gang Yu, Xiangyu Zhang, Jian Sun
There are a total of $470K$ human instances from the train and validation subsets, and $\sim 22.6$ persons per image, with various kinds of occlusions in the dataset.
Ranked #7 on Pedestrian Detection on Caltech (using extra training data)
2 code implementations • 17 Apr 2018 • Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun
Due to the gap between the image classification and object detection, we propose DetNet in this paper, which is a novel backbone network specifically designed for object detection.
no code implementations • ECCV 2018 • Zhenli Zhang, Xiangyu Zhang, Chao Peng, Dazhi Cheng, Jian Sun
Modern semantic segmentation frameworks usually combine low-level and high-level features from pre-trained backbone convolutional models to boost performance.
Ranked #4 on Semantic Segmentation on PASCAL VOC 2012 val (using extra training data)
6 code implementations • CVPR 2018 • Chao Peng, Tete Xiao, Zeming Li, Yuning Jiang, Xiangyu Zhang, Kai Jia, Gang Yu, Jian Sun
The improvements in recent CNN-based object detection works, from R-CNN [11], Fast/Faster R-CNN [10, 31] to recent Mask R-CNN [14] and RetinaNet [24], mainly come from new network, new framework, or novel loss design.
5 code implementations • 20 Nov 2017 • Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun
More importantly, simply replacing the backbone with a tiny network (e.g., Xception), our Light-Head R-CNN gets 30.7 mmAP at 102 FPS on COCO, significantly outperforming the single-stage, fast detectors like YOLO and SSD on both speed and accuracy.
1 code implementation • ICCV 2017 • Yihui He, Xiangyu Zhang, Jian Sun
In this paper, we introduce a new channel pruning method to accelerate very deep convolutional neural networks. Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune each layer, by a LASSO regression based channel selection and least square reconstruction.
37 code implementations • CVPR 2018 • Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, Jian Sun
We introduce an extremely computation-efficient CNN architecture named ShuffleNet, which is designed specially for mobile devices with very limited computing power (e.g., 10-150 MFLOPs).
Ranked #79 on Person Re-Identification on DukeMTMC-reID
2 code implementations • CVPR 2017 • Chao Peng, Xiangyu Zhang, Gang Yu, Guiming Luo, Jian Sun
One of the recent trends [30, 31, 14] in network architecture design is stacking small filters (e.g., 1x1 or 3x3) in the entire network because the stacked small filters are more efficient than a large kernel, given the same computational complexity.
Ranked #8 on Semantic Segmentation on PASCAL VOC 2012 val
55 code implementations • 16 Mar 2016 • Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
Deep residual networks have emerged as a family of extremely deep architectures showing compelling accuracy and nice convergence behaviors.
Ranked #17 on Image Classification on Kuzushiji-MNIST
471 code implementations • CVPR 2016 • Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
Ranked #1 on Image Classification on cifar100
no code implementations • 26 May 2015 • Xiangyu Zhang, Jianhua Zou, Kaiming He, Jian Sun
This paper aims to accelerate the test-time computation of convolutional neural networks (CNNs), especially very deep CNNs that have substantially impacted the computer vision community.
no code implementations • 23 Apr 2015 • Shaoqing Ren, Kaiming He, Ross Girshick, Xiangyu Zhang, Jian Sun
We discover that aside from deep feature maps, a deep and convolutional per-region classifier is of particular importance for object detection, whereas latest superior image classification models (such as ResNets and GoogLeNets) do not directly lead to good detection accuracy without using such a per-region classifier.
16 code implementations • ICCV 2015 • Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
In this work, we study rectifier neural networks for image classification from two aspects.
no code implementations • CVPR 2015 • Xiangyu Zhang, Jianhua Zou, Xiang Ming, Kaiming He, Jian Sun
This paper aims to accelerate the test-time computation of deep convolutional neural networks (CNNs).
14 code implementations • 18 Jun 2014 • Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
This requirement is "artificial" and may reduce the recognition accuracy for the images or sub-images of an arbitrary size/scale.
Ranked #26 on Object Detection on PASCAL VOC 2007