1 code implementation • 13 Feb 2024 • Chongyang Gao, Kezhen Chen, Jinmeng Rao, Baochen Sun, Ruibo Liu, Daiyi Peng, Yawen Zhang, Xiaoyuan Guo, Jie Yang, VS Subrahmanian
In this paper, we introduce a novel parameter-efficient MoE method, MoE-LoRA with Layer-wise Expert Allocation (MoLA), for Transformer-based models, where each model layer has the flexibility to employ a varying number of LoRA experts.
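As a rough illustration of the idea (not the authors' implementation; every name, shape, and the top-k router here are assumptions), a layer-wise MoE of LoRA experts with a different expert count per layer can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

def lora_expert(d, r):
    # One LoRA expert: low-rank update delta_W = B @ A with rank r << d.
    A = rng.normal(scale=0.01, size=(r, d))
    B = np.zeros((d, r))  # zero-init so the update starts at zero
    return A, B

def mola_layer(x, W, experts, gate_W, top_k=2):
    # A router scores every LoRA expert, keeps the top-k, and mixes
    # their low-rank updates on top of the frozen weight W.
    scores = x @ gate_W                      # (num_experts,)
    top = np.argsort(scores)[-top_k:]
    weights = np.exp(scores[top])
    weights /= weights.sum()                 # softmax over the top-k
    out = x @ W.T
    for w, idx in zip(weights, top):
        A, B = experts[idx]
        out += w * (x @ A.T @ B.T)           # w * (B A x)
    return out

d, r = 16, 4
# Layer-wise allocation: the per-layer expert count is a free design
# choice; here deeper layers are (arbitrarily) given more experts.
experts_per_layer = [2, 4, 6]
x = rng.normal(size=d)
for n_exp in experts_per_layer:
    W = rng.normal(scale=0.1, size=(d, d))   # frozen pretrained weight
    experts = [lora_expert(d, r) for _ in range(n_exp)]
    gate_W = rng.normal(scale=0.1, size=(d, n_exp))
    x = mola_layer(x, W, experts, gate_W, top_k=min(2, n_exp))
print(x.shape)  # (16,)
```

Only the gate and the A/B matrices would be trained, which is what makes the method parameter-efficient.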
no code implementations • 7 Sep 2023 • Jiaying Lu, Jinmeng Rao, Kezhen Chen, Xiaoyuan Guo, Yawen Zhang, Baochen Sun, Carl Yang, Jie Yang
Large Vision-Language Models (LVLMs) offer remarkable benefits for a variety of vision-language tasks.
no code implementations • 19 Aug 2023 • Diji Yang, Kezhen Chen, Jinmeng Rao, Xiaoyuan Guo, Yawen Zhang, Jie Yang, Yi Zhang
Visual language tasks require AI models to comprehend and reason with both visual and textual content.
no code implementations • 31 May 2023 • Xiaoyuan Guo, Kezhen Chen, Jinmeng Rao, Yawen Zhang, Baochen Sun, Jie Yang
To train LOWA, we propose a hybrid vision-language training strategy to learn object detection and recognition with class names as well as attribute information.
no code implementations • 15 Nov 2022 • Yawen Zhang, Michael Hannigan, Qin Lv
In this work, we explore the use of mobile sensing data (i.e., air quality sensors installed on vehicles) to detect pollution hotspots.
no code implementations • 29 Sep 2021 • Pengcheng Li, Yixin Guo, Yawen Zhang, Qinggang Zhou
Mini-batch Stochastic Gradient Descent (SGD) requires workers to halt forward/backward propagation and wait for gradients to be synchronized among all workers before starting the next batch of tasks.
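The synchronization barrier described above can be illustrated with a toy simulation (assumed names and a sequential loop standing in for parallel workers; this is not the paper's system):

```python
import numpy as np

rng = np.random.default_rng(1)

def synchronous_sgd_step(w, data_shards, lr=0.1):
    # Each worker computes a gradient on its own data shard, then all
    # workers block at a barrier until every gradient is available;
    # only the averaged gradient advances the shared model.
    grads = []
    for X, y in data_shards:              # one iteration per "worker"
        pred = X @ w
        grads.append(X.T @ (pred - y) / len(y))
    g = np.mean(grads, axis=0)            # the "all-reduce" average
    return w - lr * g

# Toy linear-regression problem split across 4 workers.
w_true = np.array([2.0, -1.0])
shards = []
for _ in range(4):
    X = rng.normal(size=(32, 2))
    shards.append((X, X @ w_true))

w = np.zeros(2)
for _ in range(200):
    w = synchronous_sgd_step(w, shards)
print(w)  # close to [2.0, -1.0]
```

The `np.mean` line is exactly where real systems stall: the slowest worker in each step sets the pace for everyone.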
no code implementations • 31 May 2020 • Qinggang Zhou, Yawen Zhang, Pengcheng Li, Xiaoyong Liu, Jun Yang, Runsheng Wang, Ru Huang
The state-of-the-art deep learning algorithms rely on distributed training systems to tackle the increasing sizes of models and training data sets.
no code implementations • 4 Dec 2019 • Songtao Lu, Yawen Zhang, Yunlong Wang, Christina Mack
Federated learning opens a number of research opportunities due to its high communication efficiency in distributed training problems within a star network.
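Federated averaging in a star topology (a server at the center, clients at the leaves) can be sketched as follows; the function names and hyperparameters are illustrative assumptions, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

def local_update(w, X, y, lr=0.05, epochs=5):
    # A client runs several local SGD epochs before communicating,
    # which is where federated learning saves communication.
    w = w.copy()
    for _ in range(epochs):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return w

def fedavg_round(w_global, clients):
    # Star network: the server broadcasts w_global to every client,
    # each client trains locally on private data, and the server
    # averages the returned models weighted by local dataset size.
    updates, sizes = [], []
    for X, y in clients:
        updates.append(local_update(w_global, X, y))
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=np.array(sizes, float))

# Toy linear-regression task with unbalanced client datasets.
w_true = np.array([1.0, 3.0])
clients = []
for n in (20, 40, 60):
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ w_true))

w = np.zeros(2)
for _ in range(50):
    w = fedavg_round(w, clients)
print(w)  # close to [1.0, 3.0]
```

One model exchange per round, rather than one per gradient step, is the communication-efficiency property the snippet above alludes to.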
no code implementations • 13 Apr 2017 • Qi Liu, Yawen Zhang, Qin Lv, Li Shang
It is important to retrieve accurate melt pond fraction (MPF) from satellite data for Arctic research.