2 code implementations • 9 Apr 2024 • Zhengqing Gao, Xu-Yao Zhang, Cheng-Lin Liu
To address these issues, we propose a simple but effective framework called unified entropy optimization (UniEnt), which is capable of simultaneously adapting to covariate-shifted in-distribution (csID) data and detecting covariate-shifted out-of-distribution (csOOD) data.
no code implementations • 27 Mar 2024 • Wenzhuo LIU, Fei Zhu, Cheng-Lin Liu
Self-supervised learning (SSL) has emerged as an effective paradigm for deriving general representations from vast amounts of unlabeled data.
no code implementations • 27 Mar 2024 • Wenzhuo LIU, Fei Zhu, Cheng-Lin Liu
On the other hand, Semi-IPC learns a prototype for each class with unsupervised regularization, enabling the model to incrementally learn from partially labeled new data while maintaining the knowledge of old classes.
no code implementations • 27 Mar 2024 • Wenzhuo LIU, Fei Zhu, Cheng-Lin Liu
Convolutional Neural Networks (CNNs) have advanced significantly in visual representation learning and recognition.
no code implementations • 11 Mar 2024 • Haoru Tan, Chuang Wang, Sitong Wu, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu
In this paper, we propose a graph neural network (GNN) based approach to combine the advantages of data-driven and traditional methods.
1 code implementation • 7 Mar 2024 • Shijie Ma, Fei Zhu, Zhun Zhong, Xu-Yao Zhang, Cheng-Lin Liu
Generalized Category Discovery (GCD) is a pragmatic and challenging open-world task, which endeavors to cluster unlabeled samples from both novel and old classes, leveraging some labeled data of old classes.
1 code implementation • 5 Mar 2024 • Fei Zhu, Xu-Yao Zhang, Zhen Cheng, Cheng-Lin Liu
Reliable confidence estimation is a challenging yet fundamental requirement in many risk-sensitive applications.
no code implementations • 4 Mar 2024 • Fei Zhu, Shijie Ma, Zhen Cheng, Xu-Yao Zhang, Zhaoxiang Zhang, Cheng-Lin Liu
This paper aims to provide a comprehensive introduction to the emerging open-world machine learning paradigm, to help researchers build more powerful AI systems in their respective fields, and to promote the development of artificial general intelligence.
no code implementations • 4 Jan 2024 • Haiyang Guo, Fei Zhu, Wenzhuo LIU, Xu-Yao Zhang, Cheng-Lin Liu
On the other hand, our approach uses a pre-trained model as the backbone and applies LoRA to fine-tune only a small number of parameters when learning new classes.
no code implementations • 25 Nov 2023 • Zhong-Zhi Li, Ming-Liang Zhang, Fei Yin, Cheng-Lin Liu
Existing neural solvers treat geometry problem solving (GPS) as a vision-language task but fall short in representing geometry diagrams, which carry rich and complex layout information.
1 code implementation • 22 Nov 2023 • Zhen Cheng, Xu-Yao Zhang, Cheng-Lin Liu
Previous methods mostly rely on post-training score transformation or hybrid models to ensure low scores on OOD inputs while separating known classes.
no code implementations • 12 Sep 2023 • Jiao Zhang, Xu-Yao Zhang, Cheng-Lin Liu
We advocate that researchers in the DG community refer to dynamic performance of methods for more comprehensive and reliable evaluation.
no code implementations • 4 Aug 2023 • Wenzhuo LIU, Xinjian Wu, Fei Zhu, Mingming Yu, Chuang Wang, Cheng-Lin Liu
This is hard for DNN because it tends to focus on fitting to new classes while ignoring old classes, a phenomenon known as catastrophic forgetting.
no code implementations • 5 Jun 2023 • Wenwen Yu, Chengquan Zhang, Haoyu Cao, Wei Hua, Bohan Li, Huang Chen, MingYu Liu, Mingrui Chen, Jianfeng Kuang, Mengjun Cheng, Yuning Du, Shikun Feng, Xiaoguang Hu, Pengyuan Lyu, Kun Yao, Yuechen Yu, Yuliang Liu, Wanxiang Che, Errui Ding, Cheng-Lin Liu, Jiebo Luo, Shuicheng Yan, Min Zhang, Dimosthenis Karatzas, Xing Sun, Jingdong Wang, Xiang Bai
It is hoped that this competition will attract many researchers in the field of CV and NLP, and bring some new thoughts to the field of Document AI.
1 code implementation • 13 May 2023 • Yuliang Liu, Zhang Li, Biao Yang, Chunyuan Li, XuCheng Yin, Cheng-Lin Liu, Lianwen Jin, Xiang Bai
In this paper, we conducted a comprehensive evaluation of Large Multimodal Models, such as GPT4V and Gemini, in various text-related visual tasks including Text Recognition, Scene Text-Centric Visual Question Answering (VQA), Document-Oriented VQA, Key Information Extraction (KIE), and Handwritten Mathematical Expression Recognition (HMER).
1 code implementation • CVPR 2023 • Fei Zhu, Zhen Cheng, Xu-Yao Zhang, Cheng-Lin Liu
Reliable confidence estimation for deep neural classifiers is a challenging yet fundamental requirement in high-stakes applications.
1 code implementation • 21 Mar 2023 • Xiu-Chuan Li, Xiaobo Xia, Fei Zhu, Tongliang Liu, Xu-Yao Zhang, Cheng-Lin Liu
Label noise poses a serious threat to deep neural networks (DNNs).
2 code implementations • 16 Mar 2023 • Weixing Chen, Yang Liu, Ce Wang, Jiarui Zhu, Shen Zhao, Guanbin Li, Cheng-Lin Liu, Liang Lin
Medical report generation (MRG) is essential for computer-aided diagnosis and medication guidance, which can relieve the heavy burden of radiologists by automatically generating the corresponding medical reports according to the given radiology image.
1 code implementation • 6 Mar 2023 • Fei Zhu, Zhen Cheng, Xu-Yao Zhang, Cheng-Lin Liu
We investigate this problem and reveal that popular confidence calibration methods often lead to worse confidence separation between correct and incorrect samples, making it more difficult to decide whether to trust a prediction or not.
no code implementations • 2 Mar 2023 • Zhen Cheng, Fei Zhu, Xu-Yao Zhang, Cheng-Lin Liu
Detecting out-of-distribution (OOD) inputs has been a critical issue for neural networks in the open world.
1 code implementation • 22 Feb 2023 • Ming-Liang Zhang, Fei Yin, Cheng-Lin Liu
Geometry problem solving (GPS) is a high-level mathematical reasoning task that requires multi-modal fusion and the application of geometric knowledge.
Ranked #1 on Mathematical Reasoning on PGPS9K
no code implementations • ICCV 2023 • Yunfei Guo, Fei Yin, Xiao-Hui Li, Xudong Yan, Tao Xue, Shuqi Mei, Cheng-Lin Liu
Although previous works on traffic scene understanding have achieved great success, most of them stop at a low-level perception stage, such as road segmentation and lane detection, and few address high-level understanding.
no code implementations • 1 Aug 2022 • Jian-Hui Chen, Cheng-Lin Liu, Zuoren Wang
We further introduce a brain-inspired credit diffusion mechanism, significantly reducing the TDCA-network's parameter complexity, thereby greatly accelerating training without compromising the network's performance. Our experiments involving non-convex function optimization, supervised learning, and reinforcement learning reveal that a well-trained TDCA-network outperforms back-propagation across various settings.
1 code implementation • 19 May 2022 • Ming-Liang Zhang, Fei Yin, Yi-Han Hao, Cheng-Lin Liu
Geometry diagram parsing plays a key role in geometry problem solving, wherein primitive extraction and relation parsing remain challenging due to the complex layout and between-primitive relationships.
Ranked #1 on Scene Parsing on PGDP5K
1 code implementation • 13 May 2022 • Mei Wang, Weihong Deng, Cheng-Lin Liu
Second, transformation is achieved via swapping the learned textures across domains and a classifier for final classification is trained to predict the labels of the transformed scanned characters.
1 code implementation • 20 Mar 2022 • Guo-Wang Xie, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu
In this paper, we propose a simple yet effective approach to rectify distorted document images by estimating control points and reference points.
no code implementations • 10 Mar 2022 • Chang Liu, Chun Yang, Hai-Bo Qin, Xiaobin Zhu, Cheng-Lin Liu, Xu-Cheng Yin
Scene text recognition is a popular topic and is used extensively in industry.
no code implementations • 14 Jan 2022 • Yuqi Wang, Xu-Yao Zhang, Cheng-Lin Liu, Zhaoxiang Zhang
Moreover, through experiments we show that discrete language representation has several advantages compared with continuous feature representation, from the aspects of interpretability, generalization, and robustness.
2 code implementations • NeurIPS 2021 • Fei Zhu, Zhen Cheng, Xu-Yao Zhang, Cheng-Lin Liu
Deep learning systems typically suffer from catastrophic forgetting of past knowledge when acquiring new skills continually.
no code implementations • AAAI 2021 • Haoru Tan, Chuang Wang, Sitong Wu, Tie-Qiang Wang, Xu-Yao Zhang, Cheng-Lin Liu
It consists of three parts: a graph neural network to generate a high-level local feature, an attention-based module to normalize the rotational transform, and a global feature matching module based on proximal optimization.
no code implementations • 29 Sep 2021 • Zhen Cheng, Fei Zhu, Xu-Yao Zhang, Cheng-Lin Liu
Comprehensive experiments demonstrate that FSR is effective in alleviating the dominance of larger eigenvalues and improving adversarial robustness on different datasets.
no code implementations • CVPR 2021 • Wei Feng, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu
To overcome the lack of character-level annotations, we propose a novel weakly-supervised character center detection module, which only uses word-level annotated real images to generate character-level labels.
1 code implementation • CVPR 2021 • Fei Zhu, Xu-Yao Zhang, Chuang Wang, Fei Yin, Cheng-Lin Liu
Despite the impressive performance in many individual tasks, deep neural networks suffer from catastrophic forgetting when learning new tasks incrementally.
1 code implementation • 14 Apr 2021 • Guo-Wang Xie, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu
As camera-based documents are increasingly used, the rectification of distorted document images becomes a need to improve the recognition performance.
no code implementations • 1 Jan 2021 • Fei Zhu, Xu-Yao Zhang, Chuang Wang, Cheng-Lin Liu
In spite of the simplicity, extensive experiments demonstrate that the misclassification detection performance of DNNs can be significantly improved by seeing more generated pseudo-classes during training.
no code implementations • 1 Dec 2020 • Mengbiao Zhao, Wei Feng, Fei Yin, Xu-Yao Zhang, Cheng-Lin Liu
We propose an Expectation-Maximization (EM) based weakly-supervised learning framework to train an accurate arbitrary-shaped text detector using only a small amount of polygon-level annotated data combined with a large amount of weakly annotated data.
no code implementations • 12 Jun 2020 • Xu-Yao Zhang, Cheng-Lin Liu, Ching Y. Suen
The accuracies for many pattern recognition tasks have increased rapidly year by year, achieving or even outperforming human performance.
no code implementations • 1 Jul 2019 • Nibal Nayef, Yash Patel, Michal Busta, Pinaki Nath Chowdhury, Dimosthenis Karatzas, Wafa Khlif, Jiri Matas, Umapada Pal, Jean-Christophe Burie, Cheng-Lin Liu, Jean-Marc Ogier
With the growing cosmopolitan culture of modern cities, the need for robust Multi-Lingual scene Text (MLT) detection and recognition systems has never been greater.
no code implementations • CVPR 2019 • Xiaobing Wang, Yingying Jiang, Zhenbo Luo, Cheng-Lin Liu, Hyun-Soo Choi, Sungjin Kim
Here, a recurrent neural network based adaptive text region representation is proposed for text region refinement, where a pair of boundary points is predicted at each time step until no new points are found.
2 code implementations • 16 Aug 2018 • Zhao Zhong, Zichen Yang, Boyang Deng, Junjie Yan, Wei Wu, Jing Shao, Cheng-Lin Liu
The block-wise generation brings unique advantages: (1) it yields state-of-the-art results in comparison to hand-crafted networks on image classification; in particular, the best network generated by BlockQNN achieves a 2.35% top-1 error rate on CIFAR-10.
no code implementations • 1 Aug 2018 • Chu Wang, Yan-Ming Zhang, Cheng-Lin Liu
Anomaly detection aims to detect abnormal events by a model of normality.
no code implementations • International Joint Conference on Artificial Intelligence 2018 • Yue Xu, Fei Yin, Zhaoxiang Zhang, Cheng-Lin Liu
Layout analysis is a fundamental process in document image analysis and understanding.
no code implementations • 2 Jun 2018 • Yi-Chao Wu, Fei Yin, Xu-Yao Zhang, Li Liu, Cheng-Lin Liu
Scene text recognition has drawn great attention in the computer vision and artificial intelligence communities due to its challenges and wide applications.
3 code implementations • CVPR 2018 • Hong-Ming Yang, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu
To improve the robustness, we propose a novel learning framework called convolutional prototype learning (CPL).
no code implementations • 6 Sep 2017 • Fei Yin, Yi-Chao Wu, Xu-Yao Zhang, Cheng-Lin Liu
In this paper, we investigate the intrinsic characteristics of text recognition and, inspired by human cognition mechanisms in reading text, propose a scene text recognition method with character models on convolutional feature maps.
1 code implementation • CVPR 2018 • Zhao Zhong, Junjie Yan, Wei Wu, Jing Shao, Cheng-Lin Liu
Convolutional neural networks have achieved remarkable success in computer vision.
no code implementations • ICCV 2017 • Wenhao He, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu
To verify this point of view, we propose a deep direct regression based method for multi-oriented scene text detection.
1 code implementation • 21 Jun 2016 • Xu-Yao Zhang, Fei Yin, Yan-Ming Zhang, Cheng-Lin Liu, Yoshua Bengio
In this paper, we propose a framework by using the recurrent neural network (RNN) as both a discriminative model for recognizing Chinese characters and a generative model for drawing (generating) Chinese characters.
no code implementations • 18 Jun 2016 • Xu-Yao Zhang, Yoshua Bengio, Cheng-Lin Liu
Furthermore, although directMap+convNet can achieve the best results and surpass human-level performance, we show that writer adaptation in this case is still effective.
no code implementations • 15 Jun 2016 • Zheng Zhang, Yong Xu, Cheng-Lin Liu
Natural scene character recognition is challenging due to the cluttered background, which is hard to separate from text.
no code implementations • 29 Jan 2016 • Guo-Sen Xie, Xu-Yao Zhang, Shuicheng Yan, Cheng-Lin Liu
Learned from a large-scale training dataset, CNN features are much more discriminative and accurate than the hand-crafted features.
no code implementations • ICCV 2015 • Guo-Sen Xie, Xu-Yao Zhang, Xiangbo Shu, Shuicheng Yan, Cheng-Lin Liu
Feature pooling is an important strategy to achieve high performance in image classification.
1 code implementation • 3 Jul 2012 • Yao Lu, Kai-Zhu Huang, Cheng-Lin Liu
In particular, with high accuracy, our algorithm takes only a few seconds (on a PC) to match two graphs of 1,000 nodes.
no code implementations • 15 Mar 2012 • Kaizhu Huang, Rong Jin, Zenglin Xu, Cheng-Lin Liu
Most existing distance metric learning methods assume perfect side information that is usually given in pairwise or triplet constraints.