Search Results for author: Youhe Jiang

Found 2 papers, 2 papers with code

Improving Automatic Parallel Training via Balanced Memory Workload Optimization

1 code implementation • 5 Jul 2023 • Yujie Wang, Youhe Jiang, Xupeng Miao, Fangcheng Fu, Shenhan Zhu, Xiaonan Nie, Yaofeng Tu, Bin Cui

Transformer models have emerged as the leading approach for achieving state-of-the-art performance across various application domains, serving as the foundation for advanced large-scale deep learning (DL) models.

Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism

2 code implementations • 25 Nov 2022 • Xupeng Miao, Yujie Wang, Youhe Jiang, Chunan Shi, Xiaonan Nie, Hailin Zhang, Bin Cui

Transformer models have achieved state-of-the-art performance in various application domains and have gradually become the foundation of advanced large deep learning (DL) models.
