1 code implementation • 25 May 2023 • Yiliu Wang, Wei Chen, Milan Vojnović
We propose an algorithm and provide a regret bound for problem instances with stochastic arm outcomes according to arbitrary distributions with finite supports.
no code implementations • 22 Jan 2023 • Jialin Yi, Milan Vojnović
For the bandit feedback setting, we propose a near-optimal federated bandit algorithm called FEDEXP3.
no code implementations • 30 Nov 2022 • Jialin Yi, Milan Vojnović
We show that with suitable regularizers and communication protocols, a collaborative multi-agent follow-the-regularized-leader (FTRL) algorithm has an individual regret upper bound that matches the lower bound up to a constant factor when the number of arms is large enough relative to degrees of agents in the communication graph.
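For readers unfamiliar with FTRL, the following is a minimal single-agent, full-information sketch, not the collaborative multi-agent algorithm from the paper. It assumes a negative-entropy regularizer over the probability simplex, for which the FTRL update has the well-known closed form of exponential weights. The function names, the learning rate `eta`, and the example losses are all hypothetical and chosen only for illustration.

```python
import math

def ftrl_entropy_weights(cumulative_losses, eta):
    """FTRL over the simplex with a negative-entropy regularizer (assumed).

    Minimizes <p, L> + (1/eta) * sum_i p_i * log(p_i) over distributions p;
    the minimizer is p_i proportional to exp(-eta * L_i).
    """
    m = min(cumulative_losses)  # shift losses for numerical stability
    w = [math.exp(-eta * (loss - m)) for loss in cumulative_losses]
    total = sum(w)
    return [x / total for x in w]

def run_ftrl(loss_rounds, eta):
    """Play FTRL against a sequence of fully observed loss vectors."""
    cumulative = [0.0] * len(loss_rounds[0])
    expected_loss = 0.0
    for losses in loss_rounds:
        p = ftrl_entropy_weights(cumulative, eta)
        # Expected loss of sampling an arm from p in this round.
        expected_loss += sum(pi * li for pi, li in zip(p, losses))
        cumulative = [c + l for c, l in zip(cumulative, losses)]
    return ftrl_entropy_weights(cumulative, eta), expected_loss

# Hypothetical example: arm 0 always incurs loss 1, arm 1 incurs loss 0.
final_p, total_loss = run_ftrl([[1.0, 0.0]] * 3, eta=1.0)
```

In this toy run the distribution shifts toward the arm with lower cumulative loss. The bandit and multi-agent settings described above additionally require loss estimation from partial feedback and a communication protocol, neither of which this sketch attempts.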