no code implementations • 21 Dec 2022 • Qiyu Han, Will Wei Sun, Yichen Zhang
To fill these gaps, we consider a matrix contextual bandit framework in which the true model parameter is a low-rank matrix, and we propose a fully online procedure that simultaneously makes sequential decisions and conducts statistical inference.
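As a rough illustration of what a matrix contextual bandit with a low-rank parameter can look like, the sketch below simulates two arms whose expected reward is the trace inner product of a matrix context with a rank-1 matrix. Everything in it is an assumption made for illustration and is not taken from the paper: the two-arm setup, the rank-1 factorization `u v^T`, the epsilon-greedy rule, the plain SGD update on the factors, and all names and default values (`run_bandit`, `bilinear`, `eps`, `lr`). It covers only the decision-making side, not the statistical inference step, and implies no convergence guarantee.

```python
import random


def bilinear(u, X, v):
    """Return u^T X v, which equals the trace inner product <X, u v^T>."""
    return sum(u[i] * X[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))


def run_bandit(T=2000, d1=3, d2=3, eps=0.1, lr=0.01, noise=0.1, seed=0):
    """Simulate a hypothetical two-armed rank-1 matrix contextual bandit."""
    rng = random.Random(seed)

    def gauss_vec(d):
        return [rng.gauss(0.0, 1.0) for _ in range(d)]

    # Assumed ground truth: one rank-1 matrix u v^T per arm.
    truth = [(gauss_vec(d1), gauss_vec(d2)) for _ in range(2)]
    # Running estimates of the factors, randomly initialized.
    est = [(gauss_vec(d1), gauss_vec(d2)) for _ in range(2)]
    pulls = [0, 0]

    for _ in range(T):
        # Observe a d1 x d2 matrix context.
        X = [[rng.gauss(0.0, 1.0) for _ in range(d2)] for _ in range(d1)]

        # Epsilon-greedy: explore with probability eps, otherwise pick
        # the arm with the larger predicted reward.
        if rng.random() < eps:
            a = rng.randrange(2)
        else:
            preds = [bilinear(u, X, v) for u, v in est]
            a = 0 if preds[0] >= preds[1] else 1

        # Noisy reward from the chosen arm's true low-rank matrix.
        u_true, v_true = truth[a]
        reward = bilinear(u_true, X, v_true) + rng.gauss(0.0, noise)

        # One SGD step on 0.5 * (prediction - reward)^2 for the chosen arm.
        u, v = est[a]
        err = bilinear(u, X, v) - reward
        Xv = [sum(X[i][j] * v[j] for j in range(d2)) for i in range(d1)]
        Xtu = [sum(X[i][j] * u[i] for i in range(d1)) for j in range(d2)]
        est[a] = ([u[i] - lr * err * Xv[i] for i in range(d1)],
                  [v[j] - lr * err * Xtu[j] for j in range(d2)])
        pulls[a] += 1

    return truth, est, pulls
```

Calling `run_bandit(T=200)` returns the assumed true factors, the current factor estimates, and the number of times each arm was pulled. Storing the factors rather than the full matrix keeps the estimate low-rank by construction, which is one simple way to respect the low-rank structure in an online update.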