Search Results for author: Jingzehua Xu

Found 2 papers, 0 papers with code

Context-Former: Stitching via Latent Conditioned Sequence Modeling

no code implementations • 29 Jan 2024 • Ziqi Zhang, Jingzehua Xu, Jinxin Liu, Zifeng Zhuang, Donglin Wang, Miao Liu, Shuai Zhang

Offline reinforcement learning (RL) algorithms can learn better decision-making than the behavior policies that generated the data by stitching suboptimal trajectories together to derive more optimal ones.

D4RL • Imitation Learning • +2
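
The following is a minimal sketch of the latent-conditioned sequence-modeling idea described in the abstract above, assuming a PyTorch setup: a causal transformer policy conditioned on a latent context inferred from a demonstration segment rather than a scalar return-to-go. The module names, sizes, and the GRU context encoder are illustrative assumptions, not the ContextFormer implementation.

    import torch
    import torch.nn as nn

    class LatentConditionedPolicy(nn.Module):
        def __init__(self, state_dim, action_dim, latent_dim=16, embed_dim=64):
            super().__init__()
            # Encode a (state, action) segment into a single latent context vector.
            self.context_encoder = nn.GRU(state_dim + action_dim, latent_dim, batch_first=True)
            self.state_embed = nn.Linear(state_dim, embed_dim)
            self.latent_embed = nn.Linear(latent_dim, embed_dim)
            layer = nn.TransformerEncoderLayer(embed_dim, nhead=4, dim_feedforward=128,
                                               batch_first=True)
            self.transformer = nn.TransformerEncoder(layer, num_layers=2)
            self.action_head = nn.Linear(embed_dim, action_dim)

        def forward(self, states, context_states, context_actions):
            # states: (B, T, state_dim); context_*: (B, K, ...) segment defining the latent.
            _, h = self.context_encoder(torch.cat([context_states, context_actions], dim=-1))
            latent = self.latent_embed(h[-1]).unsqueeze(1)                 # (B, 1, embed_dim)
            tokens = torch.cat([latent, self.state_embed(states)], dim=1)  # prepend latent token
            T = tokens.size(1)
            causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
            out = self.transformer(tokens, mask=causal)
            return self.action_head(out[:, 1:])                            # one action per state

Conditioning on a learned latent (rather than return-to-go) is one way to let a sequence model compose behavior from different trajectories; the exact conditioning and training objective in the paper may differ.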

A dynamical clipping approach with task feedback for Proximal Policy Optimization

no code implementations • 12 Dec 2023 • Ziqi Zhang, Jingzehua Xu, Zifeng Zhuang, Jinxin Liu, Donglin Wang, Shuai Zhang

Unlike previous clipping approaches, we treat increasing the maximum cumulative return of a reinforcement learning (RL) task as that task's preference, and propose a bi-level proximal policy optimization paradigm that not only optimizes the policy but also dynamically adjusts the clipping bound to reflect this preference, further improving the training outcomes and stability of PPO.

Language Modelling • Large Language Model • +1
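
Below is a minimal sketch, assuming a PyTorch setup, of the general idea of adjusting PPO's clipping bound from task feedback as described in the abstract above. The concrete update rule (widen the bound when mean return improves, shrink it otherwise), step size, and limits are illustrative assumptions, not the paper's bi-level procedure.

    import torch

    def ppo_clip_loss(ratio, advantage, eps):
        # Standard PPO clipped surrogate objective (to be maximized).
        unclipped = ratio * advantage
        clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantage
        return torch.min(unclipped, clipped).mean()

    def update_clip_bound(eps, mean_return, prev_return, step=0.01, lo=0.05, hi=0.3):
        # Outer-level step: nudge the clipping bound in the direction that raised return.
        if mean_return > prev_return:
            return min(eps + step, hi)   # return improved: allow larger policy updates
        return max(eps - step, lo)       # return dropped: be more conservative

The bi-level structure is reflected in the separation of the inner policy update (the clipped surrogate loss) from the outer adjustment of the clipping bound driven by the task's return.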
