1 code implementation • 22 Feb 2024 • Lei Zhu, Xinjiang Wang, Wayne Zhang, Rynson W. H. Lau
Practical large language model (LLM) services may involve a long system prompt, which specifies the instructions, examples, and knowledge documents of the task and is reused across numerous requests.
3 code implementations • CVPR 2023 • Jingkang Yang, Wenxuan Peng, Xiangtai Li, Zujin Guo, Liangyu Chen, Bo Li, Zheng Ma, Kaiyang Zhou, Wayne Zhang, Chen Change Loy, Ziwei Liu
PVSG relates to the existing video scene graph generation (VidSGG) problem, which focuses on temporal interactions between humans and objects grounded with bounding boxes in videos.
1 code implementation • NeurIPS 2023 • Mengcheng Lan, Xinjiang Wang, Yiping Ke, Jiaxing Xu, Litong Feng, Wayne Zhang
Unsupervised semantic segmentation is a challenging task that segments images into semantic groups without manual annotation.
1 code implementation • ICCV 2023 • Yijiang Li, Xinjiang Wang, Lihe Yang, Litong Feng, Wayne Zhang, Ying Gao
Deep co-training has been introduced to semi-supervised segmentation and achieves impressive results, yet few studies have explored the working mechanism behind it.
1 code implementation • ICCV 2023 • Haoqi Wang, Zhizhong Li, Wayne Zhang
We generalize the class vectors found in neural networks to linear subspaces (i. e.~points in the Grassmann manifold) and show that the Grassmann Class Representation (GCR) enables the simultaneous improvement in accuracy and feature transferability.
1 code implementation • 15 Jun 2023 • Jingyang Zhang, Jingkang Yang, Pengyun Wang, Haoqi Wang, Yueqian Lin, Haoran Zhang, Yiyou Sun, Xuefeng Du, Kaiyang Zhou, Wayne Zhang, Yixuan Li, Ziwei Liu, Yiran Chen, Hai Li
Out-of-Distribution (OOD) detection is critical for the reliable operation of open-world intelligent systems.
Out-of-Distribution Detection Out of Distribution (OOD) Detection
1 code implementation • CVPR 2023 • Lei Zhu, Xinjiang Wang, Zhanghan Ke, Wayne Zhang, Rynson Lau
As the core building block of vision transformers, attention is a powerful tool to capture long-range dependency.
Ranked #9 on Object Detection on COCO 2017 (mAP metric)
3 code implementations • 13 Oct 2022 • Jingkang Yang, Pengyun Wang, Dejian Zou, Zitang Zhou, Kunyuan Ding, Wenxuan Peng, Haoqi Wang, Guangyao Chen, Bo Li, Yiyou Sun, Xuefeng Du, Kaiyang Zhou, Wayne Zhang, Dan Hendrycks, Yixuan Li, Ziwei Liu
Out-of-distribution (OOD) detection is vital to safety-critical machine learning applications and has thus been extensively studied, with a plethora of methods developed in the literature.
1 code implementation • CVPR 2023 • Xinjiang Wang, Xingyi Yang, Shilong Zhang, Yijiang Li, Litong Feng, Shijie Fang, Chengqi Lyu, Kai Chen, Wayne Zhang
In this study, we dive deep into the inconsistency of pseudo targets in semi-supervised object detection (SSOD).
1 code implementation • CVPR 2023 • Lihe Yang, Lei Qi, Litong Feng, Wayne Zhang, Yinghuan Shi
In this work, we revisit the weak-to-strong consistency framework, popularized by FixMatch from semi-supervised classification, where the prediction of a weakly perturbed image serves as supervision for its strongly perturbed version.
Ranked #1 on Semi-supervised Change Detection on WHU - 10% labeled data (IoU metric)
Semi-supervised Change Detection Semi-supervised Medical Image Segmentation +1
1 code implementation • 22 Jul 2022 • Jingkang Yang, Yi Zhe Ang, Zujin Guo, Kaiyang Zhou, Wayne Zhang, Ziwei Liu
Existing research addresses scene graph generation (SGG) -- a critical technology for scene understanding in images -- from a detection perspective, i. e., objects are detected using bounding boxes followed by prediction of their pairwise relationships.
Ranked #5 on Panoptic Scene Graph Generation on PSG Dataset
2 code implementations • CVPR 2022 • Haoqi Wang, Zhizhong Li, Litong Feng, Wayne Zhang
Most of the existing Out-Of-Distribution (OOD) detection algorithms depend on single input source: the feature, the logit, or the softmax probability.
1 code implementation • 14 Dec 2021 • Yi Li, Yiqun Duan, Zhanghui Kuang, Yimin Chen, Wayne Zhang, Xiaomeng Li
So we try to improve WSSS in the aspect of noise mitigation.
Ranked #23 on Weakly-Supervised Semantic Segmentation on COCO 2014 val
2 code implementations • ICCV 2021 • Yi Li, Zhanghui Kuang, Liyang Liu, Yimin Chen, Wayne Zhang
For these matters, we propose the following designs to push the performance to new state-of-art: (i) Coefficient of Variation Smoothing to smooth the CAMs adaptively; (ii) Proportional Pseudo-mask Generation to project the expanded CAMs to pseudo-mask based on a new metric indicating the importance of each class on each location, instead of the scores trained from binary classifiers.
Ranked #27 on Weakly-Supervised Semantic Segmentation on COCO 2014 val
2 code implementations • ICCV 2021 • Jingkang Yang, Haoqi Wang, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang, Ziwei Liu
The proposed UDG can not only enrich the semantic knowledge of the model by exploiting unlabeled data in an unsupervised manner, but also distinguish ID/OOD samples to enhance ID classification and OOD detection tasks simultaneously.
Out-of-Distribution Detection Out of Distribution (OOD) Detection
1 code implementation • 14 Aug 2021 • Zhanghui Kuang, Hongbin Sun, Zhizhong Li, Xiaoyu Yue, Tsui Hin Lin, Jianyong Chen, Huaqiang Wei, Yiqin Zhu, Tong Gao, Wenwei Zhang, Kai Chen, Wayne Zhang, Dahua Lin
We present MMOCR-an open-source toolbox which provides a comprehensive pipeline for text detection and recognition, as well as their downstream tasks such as named entity recognition and key information extraction.
no code implementations • 13 Aug 2021 • Xiaopeng Yan, Riquan Chen, Litong Feng, Jingkang Yang, Huabin Zheng, Wayne Zhang
In this paper, we propose to label only the most representative samples to expand the labeled set.
1 code implementation • ICCV 2021 • Xiaoyu Yue, Shuyang Sun, Zhanghui Kuang, Meng Wei, Philip Torr, Wayne Zhang, Dahua Lin
As a typical example, the Vision Transformer (ViT) directly applies a pure transformer architecture on image classification, by simply splitting images into tokens with a fixed length, and employing transformers to learn relations between these tokens.
2 code implementations • 2 Aug 2021 • Liyang Liu, Shilong Zhang, Zhanghui Kuang, Aojun Zhou, Jing-Hao Xue, Xinjiang Wang, Yimin Chen, Wenming Yang, Qingmin Liao, Wayne Zhang
Our method can be used to prune any structures including those with coupled channels.
Ranked #4 on Network Pruning on ImageNet
no code implementations • 21 May 2021 • Shijie Fang, Yuhang Cao, Xinjiang Wang, Kai Chen, Dahua Lin, Wayne Zhang
The performance of object detection, to a great extent, depends on the availability of large annotated datasets.
8 code implementations • CVPR 2021 2021 • Yiqin Zhu, Jianyong Chen, Lingyu Liang, Zhanghui Kuang, Lianwen Jin, Wayne Zhang
One of the main challenges for arbitrary-shaped text detection is to design a good text instance representation that allows networks to learn diverse text geometry variances.
2 code implementations • 26 Mar 2021 • Hongbin Sun, Zhanghui Kuang, Xiaoyu Yue, Chenhao Lin, Wayne Zhang
In order to roundly evaluate our proposed method as well as boost the future research, we release a new dataset named WildReceipt, which is collected and annotated tailored for the evaluation of key information extraction from document images of unseen templates in the wild.
2 code implementations • ICLR 2021 • Liyang Liu, Yi Li, Zhanghui Kuang, Jing-Hao Xue, Yimin Chen, Wenming Yang, Qingmin Liao, Wayne Zhang
Multi-task learning (MTL) has been widely used in representation learning.
1 code implementation • 12 Oct 2020 • Jingkang Yang, Weirong Chen, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang
VSGraph-LC starts from anchor selection referring to the semantic similarity between metadata and correct label concepts, and then propagates correct labels from anchors on a visual graph using graph neural network (GNN).
Ranked #9 on Image Classification on WebVision-1000
4 code implementations • ECCV 2020 • Jingkang Yang, Litong Feng, Weirong Chen, Xiaopeng Yan, Huabin Zheng, Ping Luo, Wayne Zhang
Therefore, a simple yet effective WSL framework is proposed.
Ranked #7 on Image Classification on WebVision-1000
3 code implementations • ECCV 2020 • Jianchao Wu, Zhanghui Kuang, Li-Min Wang, Wayne Zhang, Gangshan Wu
In this work, we first empirically find the recognition accuracy is highly correlated with the bounding box size of an actor, and thus higher resolution of actors contributes to better performance.
4 code implementations • ECCV 2020 • Xiaoyu Yue, Zhanghui Kuang, Chenhao Lin, Hongbin Sun, Wayne Zhang
Theoretically, our proposed method, dubbed \emph{RobustScanner}, decodes individual characters with dynamic ratio between context and positional clues, and utilizes more positional ones when the decoding sequences with scarce context, and thus is robust and practical.
1 code implementation • ICML 2020 • Xingyu Xie, Hao Kong, Jianlong Wu, Wayne Zhang, Guangcan Liu, Zhouchen Lin
While successful in many fields, deep neural networks (DNNs) still suffer from some open problems such as bad local minima and unsatisfactory generalization performance.
2 code implementations • CVPR 2020 • Xinjiang Wang, Shilong Zhang, Zhuoran Yu, Litong Feng, Wayne Zhang
Inspired by this, a convolution across the pyramid level is proposed in this study, which is termed pyramid convolution and is a modified 3-D convolution.
Ranked #87 on Object Detection on COCO test-dev
1 code implementation • 4 Feb 2020 • Chenhao Lin, Siwen Wang, Dongqi Xu, Yu Lu, Wayne Zhang
Weakly supervised object detection (WSOD) using only image-level annotations has attracted growing attention over the past few years.
Ranked #10 on Weakly Supervised Object Detection on PASCAL VOC 2007
no code implementations • 20 Sep 2019 • Zhe Huang, Weijiang Yu, Wayne Zhang, Litong Feng, Nong Xiao
Taking the residual result (the coarse de-rained result) between the rainy image sample (i. e. the input data) and the output of coarse stage (i. e. the learnt rain mask) as input, the fine stage continues to de-rain by removing the fine-grained rain streaks (e. g. light rain streaks and water mist) to get a rain-free and well-reconstructed output image via a unified contextual merging sub-network with dense blocks and a merging block.
1 code implementation • 6 Sep 2019 • Guangcan Liu, Wayne Zhang
This paper studies the problem of time series forecasting (TSF) from the perspective of compressed sensing.
1 code implementation • ICCV 2019 • Youjiang Xu, Jiaqi Duan, Zhanghui Kuang, Xiaoyu Yue, Hongbin Sun, Yue Guan, Wayne Zhang
Large geometry (e. g., orientation) variances are the key challenges in the scene text detection.
Ranked #10 on Scene Text Detection on ICDAR 2017 MLT
no code implementations • ICCV 2019 • Zhanghui Kuang, Yiming Gao, Guanbin Li, Ping Luo, Yimin Chen, Liang Lin, Wayne Zhang
To address this issue, we propose a novel Graph Reasoning Network (GRNet) on a Similarity Pyramid, which learns similarities between a query and a gallery cloth by using both global and local representations in multiple scales.
Ranked #4 on Image Retrieval on DeepFashion - Consumer-to-shop (Rank-1 metric)
1 code implementation • CVPR 2019 • Yi Li, Zhanghui Kuang, Yimin Chen, Wayne Zhang
The most informative output neurons in each block are preserved while others are discarded, and thus neurons for multiple scales are competitively and adaptively allocated.
Ranked #705 on Image Classification on ImageNet