Improved Communication Cost in Distributed PageRank Computation – A Theoretical Study

ICML 2020  ·  Siqiang Luo ·

PageRank is a widely used approach for measuring the importance of a node in a graph. Computing PageRank is a fundamental task in numerous applications including web search, machine learning and recommendation systems. The importance of computing PageRanks in a distributed environment has been recognized due to the rapid growth of the graph size in real world. However, only a few previous works can provide a provable complexity and accuracy for distributed PageRank computation. Given a constant $d>0$ and a graph of $n$ nodes and under the well-known congested-clique distributed model, the state-of-the-art approach, Radar-Push, uses $O(\log\log{n}+\log{d})$ communication rounds to approximate the PageRanks within a relative error $O(\frac{1}{\log^d{n}})$. However, Radar-Push entails as large as $O(\log^{2d+3}{n})$ bits of bandwidth (e.g., the communication cost between a pair of nodes per round) in the worst case. In this paper, we provide a new algorithm that uses asymptotically the same communication rounds while significantly improves the bandwidth from $O(\log^{2d+3}{n})$ bits to $O(d\log^3{n})$ bits. To the best of our knowledge, our distributed PageRank algorithm is the first to achieve $o(d\log{n})$ communication rounds with $O(d\log^3{n})$ bits of bandwidth in approximating PageRanks with relative error $O(\frac{1}{\log^d{n}})$ under the congested-clique model.

PDF ICML 2020 PDF
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here