Search Results for author: Lianghui Zhu

Found 7 papers, 6 papers with code

Task-oriented Embedding Counts: Heuristic Clustering-driven Feature Fine-tuning for Whole Slide Image Classification

no code implementations • 2 Jun 2024 • Xuenian Wang, Shanshan Shi, Renao Yan, Qiehe Sun, Lianghui Zhu, Tian Guan, Yonghong He

To address this issue, we propose a heuristic clustering-driven feature fine-tuning method (HC-FT) to enhance the performance of multiple instance learning by providing purified positive and hard negative samples.

Clustering · Image Classification +1
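The HC-FT idea above can be sketched as: cluster the instance features of a slide, treat the cluster with the higher mean attention score as positive, and select its members nearest the positive centroid as purified positives and the other cluster's nearest members as hard negatives. A minimal numpy sketch under those assumptions (the helper names and selection rule are illustrative, not the paper's exact procedure):

```python
import numpy as np

def kmeans(feats, k=2, iters=20):
    """Tiny k-means with deterministic farthest-point initialisation."""
    cents = [feats[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(feats - c, axis=1) for c in cents], axis=0)
        cents.append(feats[np.argmax(d)])
    cents = np.array(cents, dtype=float)
    labels = np.zeros(len(feats), dtype=int)
    for _ in range(iters):
        d = np.linalg.norm(feats[:, None] - cents[None], axis=-1)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                cents[j] = feats[labels == j].mean(axis=0)
    return cents, labels

def select_samples(feats, scores, k=2, top=4):
    """Cluster instance features; the cluster with the higher mean score is
    'positive'. Purified positives = its members nearest the positive
    centroid; hard negatives = other-cluster members nearest that centroid."""
    cents, labels = kmeans(feats, k)
    means = [scores[labels == j].mean() if (labels == j).any() else -np.inf
             for j in range(k)]
    pos = int(np.argmax(means))
    d_pos = np.linalg.norm(feats - cents[pos], axis=1)
    pos_idx = np.where(labels == pos)[0]
    neg_idx = np.where(labels != pos)[0]
    purified = pos_idx[np.argsort(d_pos[pos_idx])[:top]]
    hard_neg = neg_idx[np.argsort(d_pos[neg_idx])[:top]]
    return purified, hard_neg
```

The selected subsets would then be fed back to fine-tune the feature extractor, which is the step the sketch leaves out.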

DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention

1 code implementation • 28 May 2024 • Lianghui Zhu, Zilong Huang, Bencheng Liao, Jun Hao Liew, Hanshu Yan, Jiashi Feng, Xinggang Wang

In this paper, we aim to leverage the long sequence modeling capability of Gated Linear Attention (GLA) Transformers, expanding its applicability to diffusion models.
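The long-sequence appeal of GLA comes from its recurrent form: a fixed-size state matrix is decayed elementwise by a data-dependent gate and updated with a rank-1 outer product each step, so cost grows linearly with sequence length. A minimal, unnormalised sketch (real GLA learns the gate/QKV projections and uses a chunked parallel form for hardware efficiency):

```python
import numpy as np

def gated_linear_attention(q, k, v, g):
    """Recurrent gated linear attention over a sequence.
    q, k: (T, d_k); v: (T, d_v); g: (T, d_k) gates in (0, 1]."""
    T, dk = q.shape
    dv = v.shape[1]
    S = np.zeros((dk, dv))          # fixed-size state, independent of T
    out = np.empty((T, dv))
    for t in range(T):
        S = g[t][:, None] * S + np.outer(k[t], v[t])  # gated decay + update
        out[t] = q[t] @ S                             # read out with the query
    return out
```

With all gates set to 1 this reduces to plain (unnormalised) linear attention, i.e. a running sum of key-value outer products.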

ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention

1 code implementation • 28 May 2024 • Bencheng Liao, Xinggang Wang, Lianghui Zhu, Qian Zhang, Chang Huang

Recently, linear complexity sequence modeling networks have achieved modeling capabilities similar to Vision Transformers on a variety of computer vision tasks, while using fewer FLOPs and less memory.

Representation Learning

WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition

1 code implementation • 22 Feb 2024 • Lianghui Zhu, Junwei Zhou, Yan Liu, Xin Hao, Wenyu Liu, Xinggang Wang

Weakly supervised visual recognition using inexact supervision is a critical yet challenging learning problem.

object-detection · Segmentation +2

Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

8 code implementations • 17 Jan 2024 • Lianghui Zhu, Bencheng Liao, Qian Zhang, Xinlong Wang, Wenyu Liu, Xinggang Wang

The results demonstrate that Vim is capable of overcoming the computation & memory constraints on performing Transformer-style understanding for high-resolution images and it has great potential to be the next-generation backbone for vision foundation models.

Image Classification · object-detection +4
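Vim's bidirectional design can be sketched with a toy diagonal state-space recurrence: scan the flattened patch-token sequence forward and backward, then sum the two outputs so every token sees context from both directions. This is a simplified stand-in (real Vim uses selective, input-dependent SSM parameters and learned projections):

```python
import numpy as np

def ssm_scan(x, a, b, c):
    """Diagonal linear SSM: h_t = a * h_{t-1} + b * x_t, y_t = c * h_t.
    x: (T, d) token sequence; a, b, c: per-channel (d,) parameters."""
    h = np.zeros(x.shape[1])
    ys = np.empty_like(x)
    for t in range(len(x)):
        h = a * h + b * x[t]
        ys[t] = c * h
    return ys

def bidirectional_ssm(x, a, b, c):
    """Vim-style bidirectional modeling (sketch): forward scan plus a
    backward scan over the reversed sequence, summed per position."""
    fwd = ssm_scan(x, a, b, c)
    bwd = ssm_scan(x[::-1], a, b, c)[::-1]
    return fwd + bwd
```

Like the GLA recurrence, the state is fixed-size, which is what lets this style of model sidestep the quadratic memory of self-attention on high-resolution images.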

JudgeLM: Fine-tuned Large Language Models are Scalable Judges

1 code implementation • 26 Oct 2023 • Lianghui Zhu, Xinggang Wang, Xinlong Wang

To address this problem, we propose to fine-tune LLMs as scalable judges (JudgeLM) to evaluate LLMs efficiently and effectively in open-ended benchmarks.
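The judge workflow amounts to formatting the question and candidate answers into a grading prompt, sending it to the fine-tuned judge model, and parsing numeric scores from its reply. A minimal sketch of the scaffolding around the model call; the template wording and helper names are assumptions, not JudgeLM's exact format:

```python
import re

# Illustrative grading template (hypothetical, not the paper's exact prompt).
JUDGE_TEMPLATE = (
    "You are a judge. Rate the two answers to the question on a 1-10 scale.\n"
    "Question: {question}\n"
    "Answer A: {answer_a}\n"
    "Answer B: {answer_b}\n"
    "Reply with two numbers, e.g. '7 9'."
)

def build_judge_prompt(question, answer_a, answer_b):
    """Fill the grading template; the result is what the judge LLM receives."""
    return JUDGE_TEMPLATE.format(
        question=question, answer_a=answer_a, answer_b=answer_b
    )

def parse_scores(judge_output):
    """Pull the first two numbers out of the judge's reply."""
    nums = re.findall(r"\d+(?:\.\d+)?", judge_output)
    if len(nums) < 2:
        raise ValueError("judge reply did not contain two scores")
    return float(nums[0]), float(nums[1])
```

Because the judge emits plain scores, many answer pairs can be graded in batch without human raters, which is where the "scalable" claim comes from.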

WeakTr: Exploring Plain Vision Transformer for Weakly-supervised Semantic Segmentation

1 code implementation • 3 Apr 2023 • Lianghui Zhu, Yingyue Li, Jiemin Fang, Yan Liu, Hao Xin, Wenyu Liu, Xinggang Wang

Thus a novel weight-based method is proposed to end-to-end estimate the importance of attention heads, while the self-attention maps are adaptively fused for high-quality CAM results that tend to have more complete objects.

Decoder · Weakly-supervised Learning +2
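The fusion step described above can be sketched as a softmax-weighted combination of per-head self-attention maps into a single CAM-like map. In WeakTr the head weights are estimated end-to-end; here they are fixed numbers for illustration:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

def fuse_attention_maps(attn_maps, head_logits):
    """Fuse per-head attention maps with learned head importances.
    attn_maps: (num_heads, H, W); head_logits: (num_heads,) importance
    scores. Returns a single (H, W) fused map."""
    w = softmax(np.asarray(head_logits, dtype=float))
    # Contract the head axis: sum_h w[h] * attn_maps[h]
    return np.tensordot(w, attn_maps, axes=1)
```

With equal logits this degenerates to a plain mean over heads; skewed logits let helpful heads dominate, which is the behaviour the weight-based estimation is meant to learn.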
