no code implementations • 27 Feb 2023 • Siddharth Chandak, Ilai Bistritz, Nicholas Bambos
We prove that UECB achieves a regret of $\mathcal{O}(\log(T)+\tau_c\log(\tau_c)+\tau_c\log\log(T))$ for this equilibrium bandit problem, where $\tau_c$ is the worst-case approximate convergence time to equilibrium.
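An illustrative reading of the bound (our gloss, not a statement from the paper): the delay-free term is logarithmic, and the convergence time $\tau_c$ enters only through the remaining terms.

```latex
% Illustrative reading of the UECB bound (not from the paper):
\mathrm{Regret}(T)
  = \mathcal{O}\!\left(\log T + \tau_c \log \tau_c + \tau_c \log\log T\right)
% If \tau_c = O(1): Regret(T) = O(log T), the classical logarithmic bandit rate.
% As \tau_c grows, the convergence-to-equilibrium terms \tau_c \log \tau_c and
% \tau_c \log\log T come to dominate the delay-free \log T term.
```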
no code implementations • 5 Feb 2023 • Kyriakos Lotidis, Panayotis Mertikopoulos, Nicholas Bambos
In this paper, we introduce a class of learning dynamics for general quantum games, which we call "follow the quantum regularized leader" (FTQL), in reference to the classical "follow the regularized leader" (FTRL) template for learning in finite games.
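For contrast with the quantum dynamics, here is a minimal sketch of the classical FTRL template with an entropic regularizer, which reduces to exponential weights; the payoff data and step size `eta` below are illustrative assumptions, not quantities from the paper.

```python
import numpy as np

def ftrl_exponential_weights(payoff_vectors, eta=0.1):
    """Classical FTRL with an entropic regularizer (exponential weights).

    A sketch of the FTRL template the paper generalizes; the quantum
    (FTQL) dynamics operate on density matrices rather than probability
    vectors. `payoff_vectors` is a T x n array, one row per round.
    """
    T, n = payoff_vectors.shape
    cumulative = np.zeros(n)          # running sum of payoffs per action
    plays = []
    for t in range(T):
        # argmax_x <cumulative, x> - (1/eta) * sum(x log x)  =>  softmax
        weights = np.exp(eta * (cumulative - cumulative.max()))
        x = weights / weights.sum()
        plays.append(x)
        cumulative += payoff_vectors[t]
    return np.array(plays)

# Example: two actions, action 1 slightly better on average
rng = np.random.default_rng(0)
payoffs = rng.normal(loc=[0.0, 0.2], scale=1.0, size=(500, 2))
print(ftrl_exponential_weights(payoffs)[-1])  # mass shifts to action 1
```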
no code implementations • 6 Jul 2021 • Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Peter W. Glynn, Yinyu Ye
One of the most widely used methods for solving large-scale stochastic optimization problems is distributed asynchronous stochastic gradient descent (DASGD), a family of algorithms that result from parallelizing stochastic gradient descent on distributed computing architectures (possibly asynchronously).
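A minimal sketch of the DASGD idea under simplifying assumptions: a single queue of in-flight gradients stands in for asynchronous workers, and `grad_fn`, `max_delay`, and the learning rate are hypothetical choices, not the paper's.

```python
import numpy as np
from collections import deque

def dasgd(grad_fn, x0, max_delay=3, steps=200, lr=0.05, seed=0):
    """Sketch of distributed asynchronous SGD (DASGD).

    Each applied gradient was computed at a parameter vector from up to
    a few steps ago, mimicking asynchronous parallel workers.
    `grad_fn(x, rng)` returns a stochastic gradient at x.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    inflight = deque()                   # gradients computed at stale iterates
    for t in range(steps):
        # a worker reads the current iterate and starts computing
        inflight.append(grad_fn(x, rng))
        # a gradient finishes after a random delay and is applied as-is
        if len(inflight) > rng.integers(0, max_delay + 1):
            g = inflight.popleft()       # stale gradient
            x -= lr * g
    return x

# Example: noisy quadratic, minimum at the origin
grad = lambda x, rng: 2 * x + rng.normal(scale=0.1, size=x.shape)
print(dasgd(grad, np.ones(3)))
```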
no code implementations • 8 Mar 2021 • Ilai Bistritz, Zhengyuan Zhou, Xi Chen, Nicholas Bambos, Jose Blanchet
Using these bounds, we show that FKM and EXP3 have no weighted-regret even for $d_{t}=O\left(t\log t\right)$.
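For reference, a minimal sketch of the (undelayed) FKM one-point gradient estimator that this result builds on; the loss function, feasible-ball handling, and step sizes are illustrative assumptions.

```python
import numpy as np

def fkm(loss_fn, d=2, T=2000, delta=0.1, eta=0.01, seed=0):
    """Sketch of the FKM one-point bandit gradient method.

    Only the scalar loss at the perturbed point x + delta*u is observed;
    (d/delta) * loss * u estimates the gradient of a smoothed loss.
    Delayed feedback (the subject of the paper) is omitted for brevity.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(d)
    for t in range(T):
        u = rng.normal(size=d)
        u /= np.linalg.norm(u)               # uniform on the unit sphere
        y = loss_fn(x + delta * u)           # single bandit observation
        g_hat = (d / delta) * y * u          # one-point gradient estimate
        x -= eta * g_hat
        x = x / max(1.0, np.linalg.norm(x) / (1 - delta))  # stay in the ball
    return x

# Example: convex quadratic with minimum at (0.5, -0.3)
target = np.array([0.5, -0.3])
print(fkm(lambda z: np.sum((z - target) ** 2)))
```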
no code implementations • NeurIPS 2020 • Ilai Bistritz, Nicholas Bambos
At each turn, each player chooses an action and receives a reward that is an unknown function of all the players' actions.
no code implementations • NeurIPS 2020 • Ilai Bistritz, Ariana Mann, Nicholas Bambos
We prove that our algorithm converges with probability 1 to a stationary point where all devices in the communication network distill the entire network's knowledge on the reference data, regardless of their local connections.
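The algorithm itself is not shown in this snippet; purely as a hypothetical illustration of the consensus flavor of the result (not the paper's method), here is a gossip-style averaging of soft predictions on shared reference data over a communication graph.

```python
import numpy as np

def gossip_distillation(local_preds, adjacency, rounds=50, alpha=0.5):
    """Hypothetical sketch: each device repeatedly averages its soft
    predictions on shared reference data with its neighbors', so that
    knowledge diffuses across the communication graph.

    local_preds: (n_devices, n_ref, n_classes) soft labels on reference data.
    adjacency:   (n_devices, n_devices) 0/1 symmetric connectivity matrix.
    """
    preds = local_preds.copy()
    deg = adjacency.sum(axis=1, keepdims=True)
    for _ in range(rounds):
        neighbor_avg = (adjacency @ preds.reshape(len(preds), -1)) / deg
        preds = (1 - alpha) * preds + alpha * neighbor_avg.reshape(preds.shape)
    return preds  # on a connected graph, all devices approach the same labels

# Example: 3 devices on a line graph, 4 reference points, 2 classes
rng = np.random.default_rng(1)
p = rng.dirichlet(np.ones(2), size=(3, 4))
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
print(gossip_distillation(p, A)[..., 0])     # rows agree across devices
```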
no code implementations • NeurIPS 2019 • Ilai Bistritz, Zhengyuan Zhou, Xi Chen, Nicholas Bambos, Jose Blanchet
An adversary chooses the cost of each arm in a bounded interval, and a sequence of feedback delays $\left\{ d_{t}\right\}$ that are unknown to the player.
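A minimal sketch of EXP3 adapted to delayed feedback, under the simplifying assumption that each importance-weighted cost is simply applied on arrival (the paper's exact update may differ).

```python
import numpy as np

def exp3_delayed(costs, delays, eta=0.05, seed=0):
    """Sketch of EXP3 with delayed feedback.

    costs:  (T, K) adversarial costs in [0, 1].
    delays: length-T array; the cost incurred at round t is observed at
            round t + delays[t], with the importance weight from round t.
    """
    T, K = costs.shape
    rng = np.random.default_rng(seed)
    L = np.zeros(K)                          # estimated cumulative costs
    pending = {}                             # arrival round -> [(arm, c_hat)]
    total_cost = 0.0
    for t in range(T):
        w = np.exp(-eta * (L - L.min()))
        p = w / w.sum()
        arm = rng.choice(K, p=p)
        total_cost += costs[t, arm]
        c_hat = costs[t, arm] / p[arm]       # importance-weighted cost
        pending.setdefault(t + int(delays[t]), []).append((arm, c_hat))
        for a, c in pending.pop(t, []):      # feedback arriving now
            L[a] += c
    return total_cost

T, K = 5000, 3
rng = np.random.default_rng(2)
costs = rng.uniform(size=(T, K)); costs[:, 0] *= 0.5   # arm 0 is best
print(exp3_delayed(costs, rng.integers(0, 20, size=T)))
```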
no code implementations • NeurIPS 2018 • Zhengyuan Zhou, Panayotis Mertikopoulos, Susan Athey, Nicholas Bambos, Peter W. Glynn, Yinyu Ye
We consider a game-theoretical multi-agent learning problem where feedback information can be lost during the learning process, and where the rewards are determined by a broad class of games known as variationally stable games.
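One standard way to handle lost feedback, sketched here under the assumption of independent Bernoulli losses with a known reception probability `p` (the paper's model may differ), is to reweight whatever arrives so the update stays unbiased.

```python
import numpy as np

def reweighted_gradient(grad, received, p):
    """Reweighting sketch for lossy feedback: if each gradient arrives
    independently with probability p, dividing a received gradient by p
    keeps the update unbiased, since E[received * grad / p] = grad.
    """
    return (grad / p) if received else np.zeros_like(grad)

rng = np.random.default_rng(3)
g, p = np.array([1.0, -2.0]), 0.7
samples = [reweighted_gradient(g, rng.random() < p, p) for _ in range(20000)]
print(np.mean(samples, axis=0))              # approximately [1.0, -2.0]
```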
no code implementations • ICML 2018 • Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Peter Glynn, Yinyu Ye, Li-Jia Li, Li Fei-Fei
One of the most widely used optimization methods for large-scale machine learning problems is distributed asynchronous stochastic gradient descent (DASGD).
no code implementations • NeurIPS 2017 • Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Stephen Boyd, Peter W. Glynn
In this paper, we examine a class of non-convex stochastic optimization problems which we call variationally coherent, and which properly includes pseudo-/quasiconvex and star-convex optimization problems.
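For reference, the deterministic form of the coherence condition, as we understand it from this line of work, reads:

```latex
% Variational coherence (deterministic form), for the solution set X^*:
\langle \nabla f(x),\, x - x^{*} \rangle \;\geq\; 0
\qquad \text{for all } x \in \mathcal{X},\; x^{*} \in \mathcal{X}^{*},
% with equality only when x is itself a solution; convex, pseudo-/quasiconvex,
% and star-convex problems all satisfy this inequality.
```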
no code implementations • NeurIPS 2017 • Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Peter W. Glynn, Claire Tomlin
We consider a model of game-theoretic learning based on online mirror descent (OMD) with asynchronous and delayed feedback information.
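A minimal sketch of the delayed-feedback OMD loop with a Euclidean mirror map (so the prox step is a plain gradient step), under the simplifying assumption that gradients are applied unchanged on arrival; `grad_fn` and the delay distribution are illustrative.

```python
import numpy as np

def omd_delayed(grad_fn, x0, delays, steps=300, eta=0.05, seed=0):
    """Sketch of online mirror descent with delayed feedback.

    Each gradient is computed at the iterate of its own round but only
    applied when it arrives, in arrival order.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    pending = {}
    for t in range(steps):
        g = grad_fn(x, rng)                      # feedback generated now...
        pending.setdefault(t + int(delays[t]), []).append(g)
        for g_late in pending.pop(t, []):        # ...applied on arrival
            x -= eta * g_late
    return x

grad = lambda x, rng: 2 * x + rng.normal(scale=0.1, size=x.shape)
rng = np.random.default_rng(4)
print(omd_delayed(grad, np.ones(3), rng.integers(0, 10, size=300)))
```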
no code implementations • 18 Jun 2017 • Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Stephen Boyd, Peter Glynn
In this paper, we examine the convergence of mirror descent in a class of stochastic optimization problems that are not necessarily convex (or even quasi-convex), and which we call variationally coherent.
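A minimal sketch of stochastic mirror descent on the probability simplex with the entropic mirror map, the canonical non-Euclidean example; the linear objective in the usage lines is an illustrative assumption.

```python
import numpy as np

def stochastic_mirror_descent(grad_fn, d=3, steps=2000, eta=0.05, seed=0):
    """Sketch of stochastic mirror descent with the entropic mirror map,
    whose prox step is the multiplicative-weights update
    x_i <- x_i * exp(-eta * g_i), followed by renormalization.
    """
    rng = np.random.default_rng(seed)
    x = np.ones(d) / d                           # start at the barycenter
    for _ in range(steps):
        g = grad_fn(x, rng)                      # noisy gradient sample
        x = x * np.exp(-eta * (g - g.max()))     # entropic prox step
        x /= x.sum()
    return x

# Example: linear objective <c, x>; mass concentrates on argmin of c
c = np.array([0.3, 0.1, 0.5])
grad = lambda x, rng: c + rng.normal(scale=0.1, size=3)
print(stochastic_mirror_descent(grad))           # concentrates on index 1
```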