no code implementations • 17 Oct 2022 • Lintao Ye, Ming Chi, Ruiquan Liao, Vijay Gupta
Under the assumption that the system is stable or given a known stabilizing controller, we show that our controller enjoys an expected regret that scales as $\sqrt{T}$ with the time horizon $T$ for the case of partially nested information pattern.