Thresholding Bandit with Optimal Aggregate Regret

We consider the thresholding bandit problem, whose goal is to identify the arms whose mean rewards are above a given threshold $\theta$, within a fixed budget of $T$ trials. We introduce LSA, a new, simple, and anytime algorithm that aims to minimize the aggregate regret (i.e., the expected number of misclassified arms). We prove that our algorithm is instance-wise asymptotically optimal. We also provide comprehensive empirical results demonstrating the algorithm's superior performance over existing algorithms under a variety of scenarios.
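To make the setting concrete, below is a minimal sketch of the thresholding bandit problem and the aggregate-regret objective. The allocation rule used here (pull the arm with the weakest empirical evidence for being above or below $\theta$, i.e., the smallest $S_i(\hat{\mu}_i - \theta)^2$) is an assumption in the spirit of a gap-based anytime strategy, not necessarily the paper's exact LSA rule; the reward distributions and parameters are illustrative only.

```python
import numpy as np

def thresholding_bandit(means, theta, T, rng=None):
    """Simulate a thresholding bandit run with a gap-based anytime
    allocation rule (an assumption, not the paper's exact LSA rule).

    At each step, pull the arm minimizing pulls_i * (mu_hat_i - theta)^2,
    i.e., the arm whose classification is currently least certain.
    Returns the set of arms classified as above theta after T pulls.
    """
    rng = np.random.default_rng(rng)
    K = len(means)
    pulls = np.zeros(K, dtype=int)
    sums = np.zeros(K)

    # Initialization: pull every arm once (Gaussian rewards, unit variance).
    for i in range(K):
        sums[i] += rng.normal(means[i], 1.0)
        pulls[i] += 1

    for _ in range(K, T):
        mu_hat = sums / pulls
        index = pulls * (mu_hat - theta) ** 2  # weakest-evidence arm
        i = int(np.argmin(index))
        sums[i] += rng.normal(means[i], 1.0)
        pulls[i] += 1

    mu_hat = sums / pulls
    return set(np.flatnonzero(mu_hat >= theta))


# Aggregate regret = expected number of misclassified arms,
# estimated here by averaging over independent runs.
means = np.array([0.1, 0.35, 0.5, 0.65, 0.9])
theta = 0.5
truth = set(np.flatnonzero(means >= theta))
errors = []
for seed in range(200):
    pred = thresholding_bandit(means, theta, T=2000, rng=seed)
    errors.append(len(pred ^ truth))  # symmetric difference = misclassified arms
print("empirical aggregate regret:", np.mean(errors))
```

The rule is anytime in the sense that it does not use the budget $T$ when choosing which arm to pull, matching the anytime property claimed for LSA in the abstract.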
