Search Results for author: Shijie Wang

Found 25 papers, 11 papers with code

A Survey on RAG Meets LLMs: Towards Retrieval-Augmented Large Language Models

no code implementations • 10 May 2024 • Yujuan Ding, Wenqi Fan, Liangbo Ning, Shijie Wang, Hengyun Li, Dawei Yin, Tat-Seng Chua, Qing Li

Given the powerful abilities of RAG in providing the latest and helpful auxiliary information, retrieval-augmented large language models have emerged to harness external and authoritative knowledge bases, rather than solely relying on the model's internal knowledge, to augment the generation quality of LLMs.

Information Retrieval Retrieval

Paper
Add Code

Graph Machine Learning in the Era of Large Language Models (LLMs)

no code implementations • 23 Apr 2024 • Wenqi Fan, Shijie Wang, Jiani Huang, Zhikai Chen, Yu Song, Wenzhuo Tang, Haitao Mao, Hui Liu, Xiaorui Liu, Dawei Yin, Qing Li

Meanwhile, graphs, especially knowledge graphs, are rich in reliable factual knowledge, which can be utilized to enhance the reasoning capabilities of LLMs and potentially alleviate their limitations such as hallucinations and the lack of explainability.

Few-Shot Learning Knowledge Graphs +1

Paper
Add Code

Graph Unlearning with Efficient Partial Retraining

no code implementations • 12 Mar 2024 • Jiahao Zhang, Lin Wang, Shijie Wang, Wenqi Fan

Graph Neural Networks (GNNs) have achieved remarkable success in various real-world applications.

Paper
Add Code

Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving

1 code implementation • 12 Mar 2024 • Junda Cheng, Wei Yin, Kaixuan Wang, Xiaozhi Chen, Shijie Wang, Xin Yang

In this work, we propose a new robustness benchmark to evaluate the depth estimation system under various noisy pose settings.

Ranked #1 on Monocular Depth Estimation on DDAD

Autonomous Driving Monocular Depth Estimation

Paper
Code

Converse Barrier Certificates for Finite-time Safety Verification of Continuous-time Perturbed Deterministic Systems

no code implementations • 27 Feb 2024 • Yonghan Li, Chenyu Wu, Taoran Wu, Shijie Wang, Bai Xue

In this paper, we investigate the problem of verifying the finite-time safety of continuous-time perturbed deterministic systems represented by ordinary differential equations in the presence of measurable disturbances.

Paper
Add Code

MVHumanNet: A Large-scale Dataset of Multi-view Daily Dressing Human Captures

no code implementations • 5 Dec 2023 • Zhangyang Xiong, Chenghong Li, Kenkun Liu, Hongjie Liao, Jianqiao Hu, Junyi Zhu, Shuliang Ning, Lingteng Qiu, Chongjie Wang, Shijie Wang, Shuguang Cui, Xiaoguang Han

In this era, the success of large language models and text-to-image models can be attributed to the driving force of large-scale datasets.

Action Recognition Image Generation

Paper
Add Code

Vamos: Versatile Action Models for Video Understanding

1 code implementation • 22 Nov 2023 • Shijie Wang, Qi Zhao, Minh Quan Do, Nakul Agarwal, Kwonjoon Lee, Chen Sun

What makes good video representations for video understanding, such as anticipating future activities, or answering video-conditioned questions?

Ranked #3 on Zero-Shot Video Question Answer on EgoSchema (fullset)

Language Modelling Large Language Model +2

Paper
Code

Untargeted Black-box Attacks for Social Recommendations

no code implementations • 13 Nov 2023 • Wenqi Fan, Shijie Wang, Xiao-Yong Wei, Xiaowei Mei, Qing Li

To perform untargeted attacks on social recommender systems, attackers can construct malicious social relationships for fake users to enhance the attack performance.

Decision Making Multi-agent Reinforcement Learning +1

Paper
Add Code

Object-centric Video Representation for Long-term Action Anticipation

1 code implementation • 31 Oct 2023 • Ce Zhang, Changcheng Fu, Shijie Wang, Nakul Agarwal, Kwonjoon Lee, Chiho Choi, Chen Sun

To recognize and predict human-object interactions, we use a Transformer-based neural architecture which allows the "retrieval" of relevant objects for action anticipation at various time scales.

Ranked #3 on Long Term Action Anticipation on Ego4D

Action Anticipation Human-Object Interaction Detection +4

Paper
Code

Qwen Technical Report

2 code implementations • 28 Sep 2023 • Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, Tianhang Zhu

Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans.

Ranked #3 on Multi-Label Text Classification on CC3M-TagMask

Language Modelling Large Language Model +2

11,551

Paper
Code

Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond

1 code implementation • 24 Aug 2023 • Jinze Bai, Shuai Bai, Shusheng Yang, Shijie Wang, Sinan Tan, Peng Wang, Junyang Lin, Chang Zhou, Jingren Zhou

In this work, we introduce the Qwen-VL series, a set of large-scale vision-language models (LVLMs) designed to perceive and understand both texts and images.

Ranked #3 on Visual Question Answering on MM-Vet

Chart Question Answering Image Captioning +6

3,945

Paper
Code

AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?

1 code implementation • 31 Jul 2023 • Qi Zhao, Shijie Wang, Ce Zhang, Changcheng Fu, Minh Quan Do, Nakul Agarwal, Kwonjoon Lee, Chen Sun

We propose to formulate the LTA task from two perspectives: a bottom-up approach that predicts the next actions autoregressively by modeling temporal dynamics; and a top-down approach that infers the goal of the actor and plans the needed procedure to accomplish the goal.

Ranked #1 on Long Term Action Anticipation on Ego4D

Action Anticipation counterfactual +1

Paper
Code

A Novel Multi-Agent Deep RL Approach for Traffic Signal Control

no code implementations • 5 Jun 2023 • Shijie Wang, Shangbo Wang

In this paper, we propose a Friend-Deep Q-network (Friend-DQN) approach for multiple traffic signal control in urban networks, which is based on an agent-cooperation scheme.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

2 code implementations • 18 May 2023 • Peng Wang, Shijie Wang, Junyang Lin, Shuai Bai, Xiaohuan Zhou, Jingren Zhou, Xinggang Wang, Chang Zhou

In this work, we explore a scalable way for building a general representation model toward unlimited modalities.

Ranked #1 on Semantic Segmentation on ADE20K (using extra training data)

Action Classification AudioCaps +16

6,194

Paper
Code

A Range-Null Space Decomposition Approach for Fast and Flexible Spectral Compressive Imaging

1 code implementation • 16 May 2023 • Junyu Wang, Shijie Wang, Ruijie Zhang, Zengqiang Zheng, Wenyu Liu, Xinggang Wang

We present RND-SCI, a novel framework for compressive hyperspectral image (HSI) reconstruction.

Paper
Code

Visual Tuning

no code implementations • 10 May 2023 • Bruce X. B. Yu, Jianlong Chang, Haixin Wang, Lingbo Liu, Shijie Wang, Zhiyu Wang, Junfan Lin, Lingxi Xie, Haojie Li, Zhouchen Lin, Qi Tian, Chang Wen Chen

With the surprising development of pre-trained visual foundation models, visual tuning jumped out of the standard modus operandi that fine-tunes the whole pre-trained model or just the fully connected layer.

Paper
Add Code

A Simple Adaptive Unfolding Network for Hyperspectral Image Reconstruction

1 code implementation • 24 Jan 2023 • Junyu Wang, Shijie Wang, Wenyu Liu, Zengqiang Zheng, Xinggang Wang

We present a simple, efficient, and scalable unfolding network, SAUNet, to simplify the network design with an adaptive alternate optimization framework for hyperspectral image (HSI) reconstruction.

Image Reconstruction

Paper
Code

Open-Set Fine-Grained Retrieval via Prompting Vision-Language Evaluator

no code implementations • CVPR 2023 • Shijie Wang, Jianlong Chang, Haojie Li, Zhihui Wang, Wanli Ouyang, Qi Tian

PLEor could leverage pre-trained CLIP model to infer the discrepancies encompassing both pre-defined and unknown subcategories, called category-specific discrepancies, and transfer them to the backbone network trained in the close-set scenarios.

Knowledge Distillation Retrieval +1

Paper
Add Code

Fine-grained Retrieval Prompt Tuning

no code implementations • 29 Jul 2022 • Shijie Wang, Jianlong Chang, Zhihui Wang, Haojie Li, Wanli Ouyang, Qi Tian

In this paper, we develop Fine-grained Retrieval Prompt Tuning (FRPT), which steers a frozen pre-trained model to perform the fine-grained retrieval task from the perspectives of sample prompting and feature adaptation.

Retrieval

Paper
Add Code

Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection

2 code implementations • ICCV 2023 • Yuxin Fang, Shusheng Yang, Shijie Wang, Yixiao Ge, Ying Shan, Xinggang Wang

We present an approach to efficiently and effectively adapt a masked image modeling (MIM) pre-trained vanilla Vision Transformer (ViT) for object detection, which is based on our two novel observations: (i) A MIM pre-trained vanilla ViT encoder can work surprisingly well in the challenging object-level recognition scenario even with randomly sampled partial observations, e. g., only 25% $\sim$ 50% of the input embeddings.

Instance Segmentation Object +2

2,006

Paper
Code

Pose Recognition with Cascade Transformers

2 code implementations • CVPR 2021 • Ke Li, Shijie Wang, Xiang Zhang, Yifan Xu, Weijian Xu, Zhuowen Tu

Here we utilize the encoder-decoder structure in Transformers to perform regression-based person and keypoint detection that is general-purpose and requires less heuristic design compared with the existing approaches.

Decoder Keypoint Detection +1

141

Paper
Code

Category-specific Semantic Coherency Learning for Fine-grained Image Recognition

no code implementations • 12 Oct 2020 • Shijie Wang, Zhihui Wang, Haojie Li, Wanli Ouyang

Existing deep learning based weakly supervised fine-grained image recognition (WFGIR) methods usually pick out the discriminative regions from the high-level feature (HLF) maps directly.

Attribute Fine-Grained Image Recognition

Paper
Add Code

Graph Edit Distance Reward: Learning to Edit Scene Graph

no code implementations • ECCV 2020 • Lichang Chen, Guosheng Lin, Shijie Wang, Qingyao Wu

Scene Graph, as a vital tool to bridge the gap between language domain and image domain, has been widely adopted in the cross-modality task like VQA.

Graph Matching Image Retrieval +2

Paper
Add Code

A New Dataset, Poisson GAN and AquaNet for Underwater Object Grabbing

no code implementations • 3 Mar 2020 • Chongwei Liu, Zhihui Wang, Shijie Wang, Tao Tang, Yulong Tao, Caifei Yang, Haojie Li, Xing Liu, Xin Fan

We also propose a novel Poisson-blending Generative Adversarial Network (Poisson GAN) and an efficient object detection network (AquaNet) to address two common issues within related datasets: the class-imbalance problem and the problem of mass small object, respectively.

4k Generative Adversarial Network +2

Paper
Add Code

Graph-propagation based Correlation Learning for Weakly Supervised Fine-grained Image Classification

no code implementations • AAAI-2020 2020 • Zhihui Wang, Shijie Wang, Haojie Li, Zhi Dou, Jianjun Li

The key of Weakly Supervised Fine-grained Image Classification (WFGIC) is how to pick out the discriminative regions and learn the discriminative features from them.

Ranked #25 on Fine-Grained Image Classification on FGVC Aircraft

Fine-Grained Image Classification General Classification

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.