Search Results for author: Xuedong Shang

Found 5 papers, 1 paper with code

Price of Safety in Linear Best Arm Identification

no code implementations • 15 Sep 2023 • Xuedong Shang, Igor Colin, Merwan Barlier, Hamza Cherkaoui

We introduce the safe best-arm identification framework with linear feedback, where the agent is subject to a stage-wise safety constraint that depends linearly on an unknown parameter vector.


UCB Momentum Q-learning: Correcting the bias without forgetting

1 code implementation • 1 Mar 2021 • Pierre Menard, Omar Darwiche Domingues, Xuedong Shang, Michal Valko

We propose UCBMQ, Upper Confidence Bound Momentum Q-learning, a new algorithm for reinforcement learning in tabular and possibly stage-dependent, episodic Markov decision processes.

Q-Learning


Stochastic Bandits with Vector Losses: Minimizing $\ell^\infty$-Norm of Relative Losses

no code implementations • 15 Oct 2020 • Xuedong Shang, Han Shao, Jian Qian

We study two goals: (a) finding the arm with the minimum $\ell^\infty$-norm of relative losses with a given confidence level (which corresponds to fixed-confidence best-arm identification); (b) minimizing the $\ell^\infty$-norm of cumulative relative losses (which corresponds to regret minimization).

Multi-Armed Bandits • Recommendation Systems