no code implementations • 3 Jun 2024 • Weichao Zhao, Hao Feng, Qi Liu, Jingqun Tang, Shu Wei, Binghong Wu, Lei Liao, YongJie Ye, Hao liu, Houqiang Li, Can Huang
In this mechanism, all of the diverse visual table understanding (VTU) tasks and multi-source visual embeddings involved are abstracted as concepts.
no code implementations • 20 May 2024 • Jingqun Tang, Qi Liu, YongJie Ye, Jinghui Lu, Shu Wei, Chunhui Lin, Wanqing Li, Mohamad Fitri Faiz Bin Mahmood, Hao Feng, Zhen Zhao, Yanjie Wang, Yuliang Liu, Hao liu, Xiang Bai, Can Huang
To our knowledge, MTVQA is the first multilingual TEC-VQA benchmark to provide human expert annotations for text-centric scenarios.
1 code implementation • 25 Mar 2024 • Han Wang, Yanjie Wang, YongJie Ye, Yuxiang Nie, Can Huang
Multi-modal Large Language Models (MLLMs) have demonstrated their ability to perceive objects in still images, but their application in video-related tasks, such as object tracking, remains understudied.
Ranked #1 on Zero-Shot Single Object Tracking on LaSOT
no code implementations • 29 Nov 2019 • YongJie Ye, Jingjing Zhang, Weigang Wu, Xiapu Luo, Jiannong Cao
In this paper, we design and develop a novel off-chain system to shorten the routing path for the payment network.
Cryptography and Security