no code implementations • 27 May 2024 • Yinda Chen, Haoyuan Shi, Xiaoyu Liu, Te Shi, Ruobing Zhang, Dong Liu, Zhiwei Xiong, Feng Wu
Autoregressive next-token prediction is a standard pretraining method for large-scale language models, but its application to vision tasks is hindered by the non-sequential nature of image data, leading to cumulative errors.
no code implementations • 27 May 2024 • Runzhao Yang, Yinda Chen, Zhihong Zhang, Xiaoyu Liu, Zongren Li, Kunlun He, Zhiwei Xiong, Jinli Suo, Qionghai Dai
In the field of medical image compression, Implicit Neural Representation (INR) networks have shown remarkable versatility due to their flexible compression ratios, yet they are constrained by a one-to-one fitting approach that results in lengthy encoding times.
no code implementations • 24 Mar 2024 • Yinda Chen, Che Liu, Xiaoyu Liu, Rossella Arcucci, Zhiwei Xiong
The burgeoning integration of 3D medical imaging into healthcare has led to a substantial increase in the workload of medical professionals.
no code implementations • 3 Dec 2023 • Che Liu, Cheng Ouyang, Yinda Chen, César Quilodrán-Casas, Lei Ma, Jie Fu, Yike Guo, Anand Shah, Wenjia Bai, Rossella Arcucci
This underlines T3D's potential in representation learning for 3D medical image analysis.
1 code implementation • 6 Oct 2023 • Yinda Chen, Wei Huang, Shenglong Zhou, Qi Chen, Zhiwei Xiong
By extracting semantic information from unlabeled data, self-supervised methods can improve the performance of downstream tasks, among which the mask image model (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images.
no code implementations • 19 Aug 2023 • Yinda Chen, Wei Huang, Xiaoyu Liu, Shiyu Deng, Qi Chen, Zhiwei Xiong
Instance segmentation in electron microscopy (EM) volumes is tough due to complex shapes and sparse annotations.
no code implementations • 7 Jun 2023 • Yinda Chen, Che Liu, Wei Huang, Sibo Cheng, Rossella Arcucci, Zhiwei Xiong
To address these challenges, we present Generative Text-Guided 3D Vision-Language Pretraining for Unified Medical Image Segmentation (GTGM), a framework that extends VLP to 3D medical images without relying on paired textual descriptions.