Optimal Stochastic Nonconvex Optimization with Bandit Feedback
In this paper, we analyze the continuous armed bandit problems for nonconvex cost functions under certain smoothness and sublevel set assumptions. We first derive an upper bound on the expected cumulative regret of a simple bin splitting method. We then propose an adaptive bin splitting method, which can significantly improve the performance. Furthermore, a minimax lower bound is derived, which shows that our new adaptive method achieves locally minimax optimal expected cumulative regret.
PDF AbstractTasks
Datasets
Add Datasets
introduced or used in this paper
Results from the Paper
Submit
results from this paper
to get state-of-the-art GitHub badges and help the
community compare results to other papers.
Methods
No methods listed for this paper. Add
relevant methods here