no code implementations • 19 Feb 2024 • Avinandan Bose, Simon Shaolei Du, Maryam Fazel
We study the problem of representation transfer in offline Reinforcement Learning (RL), where a learner has access to episodic data from a number of source tasks collected a priori, and aims to learn a shared representation to be used in finding a good policy for a target task.
no code implementations • 3 Feb 2024 • Yiping Wang, Yifang Chen, Wendan Yan, Kevin Jamieson, Simon Shaolei Du
In recent years, data selection has emerged as a core issue for large-scale visual-language model pretraining, especially on noisy web-curated datasets.
1 code implementation • 30 Oct 2023 • Zhaoyi Zhou, Chuning Zhu, Runlong Zhou, Qiwen Cui, Abhishek Gupta, Simon Shaolei Du
Off-policy dynamic programming (DP) techniques such as $Q$-learning have proven to be important in sequential decision-making problems.
no code implementations • 28 Sep 2023 • Jiarui Yao, Simon Shaolei Du
Currently, reinforcement learning (RL), especially deep RL, has received more and more attention in the research area.
1 code implementation • 16 Jun 2023 • Jifan Zhang, Yifang Chen, Gregory Canal, Stephen Mussmann, Arnav M. Das, Gantavya Bhatt, Yinglun Zhu, Jeffrey Bilmes, Simon Shaolei Du, Kevin Jamieson, Robert D Nowak
Labeled data are critical to modern machine learning applications, but obtaining labels can be expensive.
no code implementations • 13 Dec 2021 • Shusheng Xu, Yancheng Liang, Yunfei Li, Simon Shaolei Du, Yi Wu
A ubiquitous requirement in many practical reinforcement learning (RL) applications, including medical treatment, recommendation system, education and robotics, is that the deployed policy that actually interacts with the environment cannot change frequently.
no code implementations • 1 Jan 2021 • Shusheng Xu, Simon Shaolei Du, Yi Wu
We initiate the study on deep reinforcement learning problems that require low switching cost, i.e., small number of policy switches during training.
no code implementations • NeurIPS 2017 • Simon Shaolei Du, Jayanth Koushik, Aarti Singh, Barnabas Poczos
We consider the Hypothesis Transfer Learning (HTL) problem where one incorporates a hypothesis trained on the source domain into the learning procedure of the target domain.