no code implementations • 13 Mar 2023 • Gi-Soo Kim, Young Suh Hong, Tae Hoon Lee, Myunghee Cho Paik, Hongsoo Kim
Long-term care services for older people are in great demand in most aging societies.
no code implementations • 20 Jan 2023 • Mubarrat Chowdhury, Elkhan Ismayilzada, Khalequzzaman Sayem, Gi-Soo Kim
In this work, we propose a new algorithm with a semi-parametric reward model whose regret upper bound has state-of-the-art complexity among existing semi-parametric algorithms.
no code implementations • 21 Aug 2022 • Gi-Soo Kim, Hyun-Joon Yang, Jane P. Kim
In this work, we propose a modified actor-critic algorithm which is robust to critic misspecification and derive a novel testing procedure for the actor parameters in this case.
no code implementations • 17 May 2022 • Young-Geun Choi, Gi-Soo Kim, Seunghoon Paik, Myunghee Cho Paik
Non-stationarity is ubiquitous in human behavior, and addressing it in contextual bandits is challenging.
no code implementations • NeurIPS 2021 • Wonyoung Kim, Gi-Soo Kim, Myunghee Cho Paik
A challenging aspect of the bandit problem is that a stochastic reward is observed only for the chosen arm and the rewards of other arms remain missing.
no code implementations • 1 Feb 2021 • Wonyoung Kim, Gi-Soo Kim, Myunghee Cho Paik
A challenging aspect of the bandit problem is that a stochastic reward is observed only for the chosen arm and the rewards of other arms remain missing.
1 code implementation • NeurIPS 2019 • Gi-Soo Kim, Myunghee Cho Paik
Contextual multi-armed bandit algorithms are widely used in sequential decision tasks such as news article recommendation systems, web page ad placement algorithms, and mobile health.
no code implementations • 31 Jan 2019 • Gi-Soo Kim, Myunghee Cho Paik
We prove that the high-probability upper bound of the regret incurred by the proposed algorithm has the same order as that of the Thompson sampling algorithm for linear reward models.