Search Results for author: Weihao Cui

Found 3 papers, 0 papers with code

The CAP Principle for LLM Serving: A Survey of Long-Context Large Language Model Serving

no code implementations • 18 May 2024 • Pai Zeng, Zhenyu Ning, Jieru Zhao, Weihao Cui, Mengwei Xu, Liwei Guo, Xusheng Chen, Yizhou Shan

We survey the large language model (LLM) serving area to understand the intricate dynamics between cost-efficiency and accuracy, which is magnified by the growing need for longer contextual understanding when deploying models at a massive scale.

Language Modelling, Large Language Model

A Codesign of Scheduling and Parallelization for Large Model Training in Heterogeneous Clusters

no code implementations • 24 Mar 2024 • Chunyu Xue, Weihao Cui, Han Zhao, Quan Chen, Shulai Zhang, Pengyu Yang, Jing Yang, Shaobo Li, Minyi Guo

Adaptive parallelism exponentially enlarges the scheduling space and continually shifts the optimal parallelism plan, creating a tension between low-overhead and accurate performance-data acquisition for efficient cluster scheduling.

Scheduling

AdaptGear: Accelerating GNN Training via Adaptive Subgraph-Level Kernels on GPUs

no code implementations • 27 May 2023 • Yangjie Zhou, Yaoxu Song, Jingwen Leng, Zihan Liu, Weihao Cui, Zhendong Zhang, Cong Guo, Quan Chen, Li Li, Minyi Guo

Graph neural networks (GNNs) are powerful tools for exploring and learning from graph structures and features.
