Search Results for author: Jiandong Jin

Found 7 papers, 6 papers with code

Spatio-Temporal Side Tuning Pre-trained Foundation Models for Video-based Pedestrian Attribute Recognition

2 code implementations • 27 Apr 2024 • Xiao Wang, Qian Zhu, Jiandong Jin, Jun Zhu, Futian Wang, Bo Jiang, YaoWei Wang, Yonghong Tian

Specifically, we formulate the video-based PAR as a vision-language fusion problem and adopt a pre-trained foundation model CLIP to extract the visual features.

Attribute, Pedestrian Attribute Recognition, +1

Pre-training on High Definition X-ray Images: An Experimental Study

1 code implementation • 27 Apr 2024 • Xiao Wang, Yuehang Li, Wentao Wu, Jiandong Jin, Yao Rong, Bo Jiang, Chuanfu Li, Jin Tang

Existing X-ray based pre-trained vision models are usually trained on a relatively small-scale dataset (less than 500k samples) with limited resolution (e.g., 224 $\times$ 224).

Decoder, Miscellaneous

Crimson: Empowering Strategic Reasoning in Cybersecurity through Large Language Models

no code implementations • 1 Mar 2024 • Jiandong Jin, Bowen Tang, Mingxuan Ma, Xiao Liu, Yunfei Wang, Qingnan Lai, Jia Yang, Changling Zhou

We introduce Crimson, a system that enhances the strategic reasoning capabilities of Large Language Models (LLMs) within the realm of cybersecurity.

Hallucination, Retrieval

Pedestrian Attribute Recognition via CLIP based Prompt Vision-Language Fusion

2 code implementations • 17 Dec 2023 • Xiao Wang, Jiandong Jin, Chenglong Li, Jin Tang, Cheng Zhang, Wei Wang

In this paper, we formulate PAR as a vision-language fusion problem and fully exploit the relations between pedestrian images and attribute labels.

Attribute, Contrastive Learning, +2

SequencePAR: Understanding Pedestrian Attributes via A Sequence Generation Paradigm

2 code implementations • 4 Dec 2023 • Jiandong Jin, Xiao Wang, Chenglong Li, Lili Huang, Jin Tang

Then, a Transformer decoder is proposed to generate the human attributes by incorporating the visual features and attribute query tokens.

Attribute, Decoder, +2

Semantic-Aware Frame-Event Fusion based Pattern Recognition via Large Vision-Language Models

1 code implementation • 30 Nov 2023 • Dong Li, Jiandong Jin, Yuhao Zhang, Yanlin Zhong, Yaoyang Wu, Lan Chen, Xiao Wang, Bin Luo

Current methods typically employ backbone networks to individually extract the features of RGB frames and event streams, and subsequently fuse these features for pattern recognition.

Language Modelling, Prompt Engineering

Learning CLIP Guided Visual-Text Fusion Transformer for Video-based Pedestrian Attribute Recognition

1 code implementation • 20 Apr 2023 • Jun Zhu, Jiandong Jin, Zihan Yang, Xiaohao Wu, Xiao Wang

The averaged visual tokens and text tokens are concatenated and fed into a fusion Transformer for multi-modal interactive learning.

Attribute, Pedestrian Attribute Recognition, +1
