1 code implementation • 1 Jan 2024 • Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang, Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, Carl Yang, Yue Cheng, Liang Zhao
We categorize methods by their optimization focus (computational, memory, energy, financial, and network resources) and by their applicability across the stages of an LLM's lifecycle, including architecture design, pretraining, finetuning, and system design.
no code implementations • 25 Aug 2023 • Guangji Bai, Ziyang Yu, Zheng Chai, Yue Cheng, Liang Zhao
It utilizes an offline memory to cache historical information (e.g., node embeddings) as an affordable approximation of the exact value and achieves high concurrency.
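The caching idea in this excerpt can be illustrated with a minimal sketch. It is not the paper's implementation: the class `HistoricalEmbeddingCache`, the function `aggregate`, the zero-vector fallback, and the mean aggregator are all hypothetical choices made for illustration. In-batch neighbors contribute freshly computed embeddings, while out-of-batch neighbors are read from the cache as a cheaper, possibly stale approximation.

```python
class HistoricalEmbeddingCache:
    """Hypothetical offline memory holding the last known embedding per node."""

    def __init__(self, dim):
        self.dim = dim
        self._store = {}

    def push(self, node_id, embedding):
        # Overwrite the cached value with a newly computed embedding.
        self._store[node_id] = list(embedding)

    def pull(self, node_id):
        # Assumption: nodes never seen before fall back to a zero vector.
        return self._store.get(node_id, [0.0] * self.dim)


def aggregate(neighbors, in_batch, fresh, cache):
    """Mean-aggregate neighbor embeddings.

    Fresh values are used for in-batch neighbors; cached (historical)
    values stand in for neighbors outside the current batch.
    """
    vecs = [fresh[n] if n in in_batch else cache.pull(n) for n in neighbors]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


cache = HistoricalEmbeddingCache(dim=2)
cache.push(2, [2.0, 4.0])          # embedding computed in an earlier step
fresh = {1: [0.0, 0.0]}            # embedding computed in the current batch
result = aggregate([1, 2], in_batch={1}, fresh=fresh, cache=cache)
```

Here node 2 is outside the batch, so its cached embedding is used without recomputing it.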
no code implementations • 31 May 2022 • Zheng Chai, Guangji Bai, Liang Zhao, Yue Cheng
Traditional sampling-based methods accelerate GNN training by dropping edges and nodes, which impairs the graph integrity and model performance.
no code implementations • 16 Apr 2022 • Ahmad Faraz Khan, Yuze Li, Xinran Wang, Sabaat Haroon, Haider Ali, Yue Cheng, Ali R. Butt, Ali Anwar
Federated Learning (FL) is a machine learning approach that addresses privacy and data transfer costs by computing data at the source.
no code implementations • 1 Sep 2021 • Yujing Chen, Zheng Chai, Yue Cheng, Huzefa Rangwala
We propose a novel approach, FedConD, to detect and deal with the concept drift on local devices and minimize the effect on the performance of models in asynchronous FL.
1 code implementation • 20 May 2021 • Junxiang Wang, Hongyi Li, Zheng Chai, Yongchao Wang, Yue Cheng, Liang Zhao
Theoretical convergence to a (quantized) stationary point of the pdADMM-G algorithm and the pdADMM-G-Q algorithm is provided with a sublinear convergence rate $o(1/k)$, where $k$ is the number of iterations.
1 code implementation • 8 May 2021 • Kang Zhao, Hua Xu, Yue Cheng, Xiaoteng Li, Kai Gao
Joint entity and relation extraction is an essential task in information extraction, which aims to extract all relational triples from unstructured text.
Ranked #2 on Relation Extraction on SemEval-2010 Task-8
1 code implementation • 1 Nov 2020 • Junxiang Wang, Zheng Chai, Yue Cheng, Liang Zhao
In this paper, we propose a novel parallel deep learning ADMM framework (pdADMM) to achieve layer parallelism: parameters in each layer of neural networks can be updated independently in parallel.
4 code implementations • 14 Oct 2020 • Benjamin Carver, Jingyuan Zhang, Ao Wang, Ali Anwar, Panruo Wu, Yue Cheng
Serverless computing is increasingly being used for parallel computing workloads, which have traditionally been implemented as stateful applications.
Distributed, Parallel, and Cluster Computing
no code implementations • 12 Oct 2020 • Zheng Chai, Yujing Chen, Ali Anwar, Liang Zhao, Yue Cheng, Huzefa Rangwala
By bridging the synchronous and asynchronous training through tiering, FedAT minimizes the straggler effect with improved convergence speed and test accuracy.
1 code implementation • 9 Sep 2020 • Junxiang Wang, Zheng Chai, Yue Cheng, Liang Zhao
In this paper, we analyze the reason and propose to achieve a compelling trade-off between parallelism and accuracy by a reformulation called Tunable Subnetwork Splitting Method (TSSM), which can tune the decomposition granularity of deep neural networks.
no code implementations • 25 Jan 2020 • Zheng Chai, Ahsan Ali, Syed Zawad, Stacey Truex, Ali Anwar, Nathalie Baracaldo, Yi Zhou, Heiko Ludwig, Feng Yan, Yue Cheng
To this end, we propose TiFL, a Tier-based Federated Learning System, which divides clients into tiers based on their training performance and selects clients from the same tier in each training round to mitigate the straggler problem caused by heterogeneity in resource and data quantity.
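The tiering step described in this excerpt can be sketched as follows. This is a simplified illustration under stated assumptions, not TiFL's actual code: the helper names `build_tiers` and `select_clients`, the use of a single measured latency per client as the "training performance" signal, the equal-size split, and the uniform random choice of tier are all hypothetical.

```python
import random


def build_tiers(latency, num_tiers):
    """Sort clients by measured latency and split them into near-equal tiers."""
    ordered = sorted(latency, key=latency.get)  # fastest clients first
    size, extra = divmod(len(ordered), num_tiers)
    tiers, start = [], 0
    for t in range(num_tiers):
        end = start + size + (1 if t < extra else 0)
        tiers.append(ordered[start:end])
        start = end
    return tiers


def select_clients(tiers, clients_per_round, rng):
    """Pick one tier, then sample this round's clients from that tier only.

    Assumption: the tier is chosen uniformly at random as a placeholder policy.
    """
    tier_id = rng.randrange(len(tiers))
    tier = tiers[tier_id]
    k = min(clients_per_round, len(tier))
    return tier_id, rng.sample(tier, k)


# Hypothetical per-client response latencies in seconds.
latency = {"c0": 1.2, "c1": 9.5, "c2": 1.0, "c3": 4.1, "c4": 8.7, "c5": 3.9}
tiers = build_tiers(latency, num_tiers=3)
tier_id, chosen = select_clients(tiers, clients_per_round=2, rng=random.Random(0))
```

Because every client in a round comes from the same tier, the round's duration is bounded by clients of similar speed rather than by the slowest client overall.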
2 code implementations • 14 Oct 2019 • Benjamin Carver, Jingyuan Zhang, Ao Wang, Yue Cheng
The auto-scaling property of serverless computing platforms accommodates short tasks and bursty workloads, while the pay-per-use billing model of serverless computing providers keeps the cost of short tasks low.
Distributed, Parallel, and Cluster Computing
no code implementations • 24 Apr 2019 • Xuan Zhu, Yue Cheng, Jinye Peng, Rongzhi Wang, Mingnan Le, Xin Liu
However, GAN-based SR methods use only an image discriminator to distinguish SR images from high-resolution (HR) images.
1 code implementation • 8 Aug 2018 • Yue Cheng, Zheng Chai, Ali Anwar
Warehouse-scale cloud datacenters co-locate workloads with different and often complementary characteristics for improved resource utilization.
Distributed, Parallel, and Cluster Computing