no code implementations • 23 May 2024 • Dengwang Tang, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo
We provide a theoretical upper bound on the mis-identification probability (of the support of the best mixed arm) and show that it decays exponentially in the budget $N$, at a rate governed by constants that characterize the hardness of the problem instance.
no code implementations • 17 Oct 2023 • Dengwang Tang, Rahul Jain, Botao Hao, Zheng Wen
In this paper, we study the problem of efficient online reinforcement learning in the infinite horizon setting when there is an offline dataset to start with.
no code implementations • 16 Oct 2023 • Dengwang Tang, Dongze Ye, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo
We propose a Posterior Sampling-based reinforcement learning algorithm for POMDPs (PS4POMDPs), which is much simpler and more implementable compared to state-of-the-art optimism-based online learning algorithms for POMDPs.
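The core loop behind posterior-sampling RL is simple to state: sample a model from the current posterior, plan optimally in the sampled model, act, and update the posterior. The sketch below illustrates that principle on a tiny, fully observed two-state MDP; it is not PS4POMDPs itself (which operates on POMDPs and maintains a posterior over partially observed models), and all model parameters here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical (unknown to the learner) 2-state, 2-action MDP.
P_true = np.array([[[0.9, 0.1], [0.2, 0.8]],
                   [[0.7, 0.3], [0.05, 0.95]]])  # P_true[s, a, s']
R = np.array([[0.0, 0.0], [0.0, 1.0]])           # known reward r(s, a)

nS, nA, H = 2, 2, 10
counts = np.ones((nS, nA, nS))  # Dirichlet(1, 1) prior over each row P[s, a, :]

def solve(P, R, H):
    """Finite-horizon value iteration; returns a greedy policy per step."""
    V = np.zeros(nS)
    pi = np.zeros((H, nS), dtype=int)
    for h in reversed(range(H)):
        Q = R + P @ V          # Q[s, a] = r(s, a) + sum_s' P[s, a, s'] V[s']
        pi[h] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return pi

for ep in range(200):
    # 1) Sample a transition model from the posterior.
    P_samp = np.array([[rng.dirichlet(counts[s, a]) for a in range(nA)]
                       for s in range(nS)])
    # 2) Plan in the sampled model, 3) act for one episode, 4) update posterior.
    pi = solve(P_samp, R, H)
    s = 0
    for h in range(H):
        a = pi[h, s]
        s_next = rng.choice(nS, p=P_true[s, a])
        counts[s, a, s_next] += 1  # conjugate Dirichlet update
        s = s_next
```

The appeal noted in the abstract is visible even in this toy version: each step is a sample, a plan, and a count update, with no optimism bonuses or confidence-set constructions to maintain.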
1 code implementation • 10 Apr 2023 • Dengwang Tang, Ashutosh Nayyar, Rahul Jain

The Common Information (CI) approach provides a systematic way to transform a multi-agent stochastic control problem into a single-agent partially observable Markov decision process (POMDP) called the coordinator's POMDP.
no code implementations • 20 Mar 2023 • Botao Hao, Rahul Jain, Dengwang Tang, Zheng Wen
We first propose an Informed Posterior Sampling-based RL (iPSRL) algorithm that uses the offline dataset, and information about the expert's behavioral policy used to generate the offline dataset.
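One way to read "informed" posterior sampling is that the offline dataset is folded into the prior before online learning begins, so the learner starts from a posterior already concentrated where the expert's data is informative. The sketch below illustrates that idea on a two-armed Bernoulli bandit with conjugate Beta priors; it is a minimal stand-in, not the iPSRL algorithm (which is a full RL method and also exploits knowledge of the expert's behavioral policy), and the arm means and dataset sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
true_means = [0.3, 0.7]  # hypothetical two-armed Bernoulli bandit

# Offline dataset of (arm, reward) pairs, generated by an "expert" who
# mostly pulls the better arm.
offline = [(1, rng.binomial(1, true_means[1])) for _ in range(50)] + \
          [(0, rng.binomial(1, true_means[0])) for _ in range(5)]

# Informed prior: start from uniform Beta(1, 1) and fold in the offline data.
alpha, beta = np.ones(2), np.ones(2)
for arm, r in offline:
    alpha[arm] += r
    beta[arm] += 1 - r

# Online phase: standard Thompson sampling from the informed posterior.
pulls = np.zeros(2, dtype=int)
for t in range(500):
    theta = rng.beta(alpha, beta)  # sample a mean estimate for each arm
    arm = int(theta.argmax())      # act greedily w.r.t. the sampled means
    r = rng.binomial(1, true_means[arm])
    alpha[arm] += r                # conjugate Beta update
    beta[arm] += 1 - r
    pulls[arm] += 1
```

Because the offline data tightens the posterior on the well-explored arm before the first online pull, the sampler wastes far fewer early rounds on exploration than it would starting from the uninformative prior alone.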