no code implementations • 3 Apr 2024 • ShangHua Gao, Ada Fang, Yepeng Huang, Valentina Giunchiglia, Ayush Noori, Jonathan Richard Schwarz, Yasha Ektefaie, Jovana Kondic, Marinka Zitnik
We envision 'AI scientists' as systems capable of skeptical learning and reasoning that empower biomedical research through collaborative agents that integrate machine learning tools with experimental platforms.
2 code implementations • 29 Feb 2024 • ShangHua Gao, Teddy Koker, Owen Queen, Thomas Hartvigsen, Theodoros Tsiligkaridis, Marinka Zitnik
However, current foundation models apply to sequence data but not to time series, which present unique challenges due to the inherently diverse and multi-domain nature of time series datasets, diverging task specifications across forecasting, classification, and other tasks, and the apparent need for task-specialized models.
1 code implementation • 10 Dec 2023 • Yunheng Li, Zhongyu Li, ShangHua Gao, Qilong Wang, Qibin Hou, Ming-Ming Cheng
Effectively modeling discriminative spatio-temporal information is essential for segmenting activities in long action sequences.
1 code implementation • 5 Dec 2023 • Shanshan Zhong, Zhongzhan Huang, ShangHua Gao, Wushao Wen, Liang Lin, Marinka Zitnik, Pan Zhou
To this end, we study LLMs on the popular Oogiri game, which requires participants to exercise creativity and strong associative thinking to respond unexpectedly and humorously to a given image, text, or both, making it well suited for LoT study.
no code implementations • 12 Nov 2023 • Yilin Zhao, Xinbin Yuan, ShangHua Gao, Zhijie Lin, Qibin Hou, Jiashi Feng, Daquan Zhou
For MoV, we utilize text-to-speech (TTS) algorithms with a variety of pre-defined tones and automatically select the one that best matches the user-provided text description.
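As a rough illustration of this matching step (not the paper's implementation), the sketch below scores each pre-defined tone against the user's description with a generic sentence-embedding model; the tone catalogue, the `select_tone` helper, and the choice of sentence-transformers are assumptions made for illustration.

```python
# Minimal sketch of tone selection by text matching (not the authors' code).
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical catalogue of pre-defined TTS tones, each with a text description.
TONES = {
    "warm_narrator": "a warm, calm storytelling voice",
    "energetic_host": "an upbeat, energetic announcer voice",
    "soft_whisper": "a soft, gentle, quiet voice",
}

def select_tone(user_description: str) -> str:
    """Return the tone whose description best matches the user's request."""
    names = list(TONES)
    embs = model.encode([user_description] + [TONES[n] for n in names])
    query, candidates = embs[0], embs[1:]
    # Cosine similarity between the request and each tone description.
    sims = candidates @ query / (
        np.linalg.norm(candidates, axis=1) * np.linalg.norm(query) + 1e-8
    )
    return names[int(np.argmax(sims))]

print(select_tone("a soothing bedtime-story voice"))  # prints best-matching tone
```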
1 code implementation • ICCV 2023 • ShangHua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan
To solve this issue, we propose a Masked Diffusion Transformer (MDT) that introduces a mask latent modeling scheme to explicitly enhance the DPMs' ability to learn contextual relations among object semantic parts in an image (a sketch of the masking step follows below).
Ranked #2 on Image Generation on ImageNet 256x256
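As a hedged illustration of what a mask latent modeling scheme can look like (not the MDT implementation), the sketch below replaces a random subset of latent tokens with a learnable mask token before a small transformer encoder, so masked parts must be inferred from their context; the token count, dimension, mask ratio, and depth are placeholder choices.

```python
# Minimal sketch of masked latent-token modeling (not the MDT code).
import torch
import torch.nn as nn

class MaskedLatentEncoder(nn.Module):
    def __init__(self, dim=512, mask_ratio=0.3):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, latents):                       # latents: (B, N, D)
        B, N, D = latents.shape
        num_mask = int(N * self.mask_ratio)
        # Randomly pick which token positions to mask in each sample.
        idx = torch.rand(B, N, device=latents.device).argsort(dim=1)[:, :num_mask]
        mask = torch.zeros(B, N, dtype=torch.bool, device=latents.device)
        mask.scatter_(1, idx, True)
        # Replace masked positions with the learnable mask token.
        masked = torch.where(mask.unsqueeze(-1),
                             self.mask_token.expand(B, N, D), latents)
        return self.encoder(masked), mask             # predictions + mask layout

x = torch.randn(2, 256, 512)        # stand-in for tokens from an image latent grid
out, mask = MaskedLatentEncoder()(x)
print(out.shape, mask.sum(dim=1))   # torch.Size([2, 256, 512]), 76 masked each
```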
1 code implementation • 20 Oct 2022 • ShangHua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan
In this work, we explore a sustainable SSL framework facing two major challenges: i) learning a stronger new SSL model based on an existing pretrained SSL model, also called the "base" model, in a cost-friendly manner, and ii) allowing the training of the new model to be compatible with various base models.
Ranked #1 on Semantic Segmentation on ImageNet-S
2 code implementations • 14 Jun 2022 • ShangHua Gao, Zhong-Yu Li, Qi Han, Ming-Ming Cheng, Liang Wang
Our search scheme exploits both a global search to find coarse receptive field combinations and a local search to further refine them (sketched below).
Ranked #2 on Instance Segmentation on COCO 2017 val (AP metric)
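A minimal sketch of such a coarse-to-fine search, under the assumption that some proxy score rates each dilation-rate combination; the `evaluate` placeholder (e.g. validation accuracy of a weight-shared supernet configured with those rates) and all ranges are illustrative, not the authors' actual search code.

```python
# Minimal sketch of a coarse-to-fine receptive-field search (not the authors' code).
import itertools
import random

def evaluate(rates):                     # placeholder score; swap in a real proxy
    random.seed(hash(rates) % 2**32)     # deterministic dummy score per combination
    return random.random()

def search(num_branches=3, coarse=(1, 4, 8), radius=1, max_rate=12):
    # Global stage: score every coarse combination of dilation rates.
    best = max(itertools.product(coarse, repeat=num_branches), key=evaluate)
    # Local stage: refine each branch's rate within a small neighborhood.
    neighborhoods = [
        range(max(1, r - radius), min(max_rate, r + radius) + 1) for r in best
    ]
    return max(itertools.product(*neighborhoods), key=evaluate)

print(search())   # coarse winner nudged to a nearby, finer combination
```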
1 code implementation • 10 Jun 2022 • Zhong-Yu Li, ShangHua Gao, Ming-Ming Cheng
Specifically, instead of conducting self-supervised learning solely on feature embeddings from multiple views, we utilize feature self-relations, i.e., spatial/channel self-relations, for self-supervised learning (sketched below).
Ranked #2 on Semantic Segmentation on ImageNet-S
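As an illustrative sketch (not the paper's code), the snippet below computes pixel-pixel and channel-channel self-relation matrices from a feature map and aligns the relations of two views with a KL objective; the normalization, softmax, and loss choices are assumptions, and the random tensors merely stand in for features of two augmented views.

```python
# Minimal sketch of spatial and channel self-relations (not the authors' code).
import torch
import torch.nn.functional as F

def spatial_self_relation(feat):          # feat: (B, C, H, W)
    x = F.normalize(feat.flatten(2), dim=1)          # (B, C, HW), unit channel dim
    return torch.softmax(x.transpose(1, 2) @ x, -1)  # (B, HW, HW) pixel-pixel sims

def channel_self_relation(feat):          # feat: (B, C, H, W)
    x = F.normalize(feat.flatten(2), dim=2)          # (B, C, HW), unit spatial dim
    return torch.softmax(x @ x.transpose(1, 2), -1)  # (B, C, C) channel-channel sims

# Align the self-relations of two views with a simple KL objective; the random
# tensors here are placeholders for features of two augmentations of one image.
v1, v2 = torch.randn(2, 64, 14, 14), torch.randn(2, 64, 14, 14)
loss = F.kl_div(spatial_self_relation(v1).log(), spatial_self_relation(v2),
                reduction="batchmean")
print(loss.item())
```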
3 code implementations • 6 Jun 2021 • ShangHua Gao, Zhong-Yu Li, Ming-Hsuan Yang, Ming-Ming Cheng, Junwei Han, Philip Torr
In this work, we propose a new problem, large-scale unsupervised semantic segmentation (LUSS), with a newly created benchmark dataset to facilitate research progress.
Ranked #1 on Unsupervised Semantic Segmentation on ImageNet-S-300
no code implementations • 1 Jul 2018 • Kai Zhao, Wei Shen, ShangHua Gao, Dandan Li, Ming-Ming Cheng
In natural images, the scales (thickness) of object skeletons may vary dramatically across objects and object parts.