Search Results for author: Yi-Chen Li

Found 6 papers, 2 papers with code

BWArea Model: Learning World Model, Inverse Dynamics, and Policy for Controllable Language Generation

no code implementations • 27 May 2024 • Chengxing Jia, Pengyuan Wang, Ziniu Li, Yi-Chen Li, Zhilong Zhang, Nan Tang, Yang Yu

In a similar vein, our proposed system, the BWArea model, conceptualizes language generation as a decision-making task.

Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning

no code implementations • 27 May 2024 • Haoxin Lin, Yu-Yan Xu, Yihao Sun, Zhilong Zhang, Yi-Chen Li, Chengxing Jia, Junyin Ye, Jiaji Zhang, Yang Yu

In the online setting, ADMPO-ON demonstrates improved sample efficiency compared to previous state-of-the-art methods.

Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation

1 code implementation • 12 Mar 2024 • Chengxing Jia, Fuxiang Zhang, Yi-Chen Li, Chen-Xiao Gao, Xu-Hui Liu, Lei Yuan, Zongzhang Zhang, Yang Yu

Specifically, the objective of adversarial data augmentation is not merely to generate data resembling the offline data distribution; rather, it is to create adversarial examples that confound the learned task representations and lead to incorrect task identification.

Contrastive Learning • Data Augmentation • +3
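The idea described above can be sketched in a few lines: perturb offline transitions by ascending the task-identification loss of a task encoder, so the perturbed data confounds task inference. This is only a minimal illustration; the class and function names (TaskEncoder, adversarial_augment) and the prototype-based task classifier are our assumptions, not the paper's released implementation.

```python
# Hypothetical sketch of adversarial data augmentation against a task encoder.
import torch
import torch.nn as nn

class TaskEncoder(nn.Module):
    """Maps a flattened (s, a, r, s') transition to a task embedding."""
    def __init__(self, transition_dim: int, embed_dim: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(transition_dim, 64), nn.ReLU(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def adversarial_augment(encoder: TaskEncoder,
                        transitions: torch.Tensor,
                        task_ids: torch.Tensor,
                        prototypes: torch.Tensor,
                        step_size: float = 0.01,
                        n_steps: int = 5) -> torch.Tensor:
    """Perturb transitions so the encoder mis-identifies their task.

    Rather than matching the offline data distribution, the perturbation
    ascends the task-identification loss, producing confounding examples.
    """
    x = transitions.clone().requires_grad_(True)
    for _ in range(n_steps):
        z = encoder(x)
        # similarity of each embedding to every task prototype
        logits = z @ prototypes.t()
        loss = nn.functional.cross_entropy(logits, task_ids)
        grad, = torch.autograd.grad(loss, x)
        # gradient *ascent* on the identification loss
        x = (x + step_size * grad.sign()).detach().requires_grad_(True)
    return x.detach()

# usage with toy shapes
enc = TaskEncoder(transition_dim=10)
prototypes = torch.randn(4, 16)            # one prototype per task
batch = torch.randn(32, 10)
labels = torch.randint(0, 4, (32,))
adv_batch = adversarial_augment(enc, batch, labels, prototypes)
```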

Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics

no code implementations • 17 Feb 2024 • Xinyu Zhang, Wenjie Qiu, Yi-Chen Li, Lei Yuan, Chengxing Jia, Zongzhang Zhang, Yang Yu

DORA incorporates an information bottleneck principle that maximizes mutual information between the dynamics encoding and the environmental data, while minimizing mutual information between the dynamics encoding and the actions of the behavior policy.

Representation Learning
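Read as an objective, the description above corresponds roughly to the following information-bottleneck trade-off (the notation is ours, not necessarily the paper's): let $z$ denote the dynamics encoding produced by an encoder $\phi$, $(s, a, s')$ the environmental transition data, $a_b$ the behavior-policy action, and $\beta \ge 0$ a trade-off coefficient.

$$\max_{\phi}\; I\big(z;\,(s, a, s')\big)\;-\;\beta\, I\big(z;\, a_b\big)$$

The first term encourages the encoding to capture the environment dynamics, while the second discourages it from encoding information about the behavior policy.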

Policy Regularization with Dataset Constraint for Offline Reinforcement Learning

2 code implementations • 11 Jun 2023 • Yuhang Ran, Yi-Chen Li, Fuxiang Zhang, Zongzhang Zhang, Yang Yu

A common line of existing offline RL work is policy regularization, which typically constrains the learned policy to the distribution or support of the behavior policy.

Offline RL • reinforcement-learning • +1
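To make the "distribution constraint" flavor of policy regularization mentioned above concrete, here is a minimal TD3+BC-style actor loss that maximizes the Q-value while penalizing deviation from dataset actions. It illustrates the general idea only; it is not the dataset-constraint method proposed in this paper, and the toy Critic and regularized_actor_loss names are our own.

```python
# Generic distribution-constraint regularizer for offline RL (TD3+BC style).
import torch
import torch.nn as nn

class Critic(nn.Module):
    """Toy Q-network Q(s, a)."""
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, s: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([s, a], dim=-1))

def regularized_actor_loss(actor: nn.Module,
                           critic: Critic,
                           states: torch.Tensor,
                           dataset_actions: torch.Tensor,
                           alpha: float = 2.5) -> torch.Tensor:
    """Maximize Q while keeping the policy close to actions in the dataset."""
    actions = actor(states)
    q = critic(states, actions)
    lam = alpha / q.abs().mean().detach()          # scale-invariant weighting
    bc_penalty = ((actions - dataset_actions) ** 2).mean()
    return -lam * q.mean() + bc_penalty

# usage with toy shapes
actor = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2), nn.Tanh())
critic = Critic(8, 2)
states = torch.randn(32, 8)
dataset_actions = torch.rand(32, 2) * 2 - 1
loss = regularized_actor_loss(actor, critic, states, dataset_actions)
loss.backward()
```

A support constraint differs from this distribution constraint in that it only requires the learned policy's actions to lie where the behavior policy has nonzero density, rather than matching its action distribution.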
