no code implementations • 23 May 2024 • Jinxin Liu, Xinghong Guo, Zifeng Zhuang, Donglin Wang
The goal of DIDI is to learn a diverse set of skills from a mixture of label-free offline data.
1 code implementation • 14 May 2024 • Zifeng Zhuang, Dengyun Peng, Jinxin Liu, Ziqi Zhang, Donglin Wang
In this work, we introduce the concept of max-return sequence modeling which integrates the goal of maximizing returns into existing sequence models.
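To make the idea concrete, here is a minimal sketch of one plausible max-return objective: standard action prediction for a Decision Transformer-style sequence model, plus an expectile regression term that biases the predicted return toward the maximum in-distribution return. The loss weighting and the expectile formulation are illustrative assumptions, not necessarily the paper's exact recipe.

```python
import torch

def expectile_loss(pred_return, target_return, tau=0.99):
    # Asymmetric (expectile) regression: with tau near 1, under-predicting
    # the return is penalized far more than over-predicting it, so the model
    # learns to predict roughly the *maximum* return achievable from a state.
    diff = target_return - pred_return
    weight = torch.abs(tau - (diff < 0).float())
    return (weight * diff ** 2).mean()

def max_return_sequence_loss(pred_actions, target_actions,
                             pred_returns, target_returns):
    # Behavior cloning on actions, as in standard sequence modeling,
    # plus the return-maximizing expectile term (equal weighting assumed).
    action_loss = ((pred_actions - target_actions) ** 2).mean()
    return action_loss + expectile_loss(pred_returns, target_returns)
```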
no code implementations • 29 Jan 2024 • Ziqi Zhang, Jingzehua Xu, Jinxin Liu, Zifeng Zhuang, Donglin Wang, Miao Liu, Shuai Zhang
Offline reinforcement learning (RL) algorithms can learn better decision-making than the behavior policies by stitching suboptimal trajectories together to derive more optimal ones.
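As a toy illustration of stitching (not the paper's algorithm), two suboptimal trajectories that pass through a shared state can be recombined into a trajectory better than either original:

```python
# Two logged trajectories as (state, reward) pairs; both are suboptimal.
traj_a = [("s0", 1.0), ("s2", 0.0), ("s4", 0.0)]  # good start, poor finish
traj_b = [("s1", 0.0), ("s2", 0.0), ("s3", 5.0)]  # poor start, good finish

def stitch(prefix, suffix, shared_state):
    # Follow the prefix up to the shared state, then continue along the suffix.
    i = next(k for k, (s, _) in enumerate(prefix) if s == shared_state)
    j = next(k for k, (s, _) in enumerate(suffix) if s == shared_state)
    return prefix[: i + 1] + suffix[j + 1 :]

ret = lambda traj: sum(r for _, r in traj)
stitched = stitch(traj_a, traj_b, "s2")
print(ret(traj_a), ret(traj_b), ret(stitched))  # 1.0 5.0 6.0
```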
no code implementations • 12 Dec 2023 • Ziqi Zhang, Jingzehua Xu, Zifeng Zhuang, Jinxin Liu, Donglin Wang, Shuai Zhang
Unlike previous clipping approaches, we treat increasing the maximum cumulative return of a reinforcement learning (RL) task as the task's preference, and propose a bi-level proximal policy optimization (PPO) paradigm that not only optimizes the policy but also dynamically adjusts the clipping bound to reflect this preference, further improving the training outcomes and stability of PPO.
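A minimal sketch of what such a bi-level scheme could look like: the inner level is the standard PPO clipped surrogate, while a hypothetical outer-level rule nudges the clipping bound up when returns improve and down when they regress. The specific adjustment rule and its hyperparameters are assumptions for illustration, not the paper's exact method.

```python
import torch

def ppo_clip_loss(ratio, advantage, eps):
    # Inner level: standard PPO clipped surrogate (negated, to minimize).
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return -torch.minimum(unclipped, clipped).mean()

def adjust_clip_bound(eps, last_return, best_return,
                      step=0.01, eps_min=0.05, eps_max=0.4):
    # Outer level (hypothetical rule): widen the bound to allow larger
    # policy updates when returns are improving, shrink it otherwise.
    delta = step if last_return > best_return else -step
    return min(max(eps + delta, eps_min), eps_max)
```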
no code implementations • 10 Nov 2023 • Hongyin Zhang, Diyuan Shi, Zifeng Zhuang, Han Zhao, Zhenyu Wei, Feng Zhao, Sibo Gai, Shangke Lyu, Donglin Wang
Developing robotic intelligent systems that can quickly adapt to unseen situations in the wild is one of the critical challenges in pursuing autonomous robotics.
no code implementations • 7 Oct 2023 • Ziqi Zhang, Xiao Xiong, Zifeng Zhuang, Jinxin Liu, Donglin Wang
Studying how to fine-tune policies pre-trained with offline reinforcement learning (RL) is profoundly significant for enhancing the sample efficiency of RL algorithms.
1 code implementation • 19 Jul 2023 • Yachen Kang, Li He, Jinxin Liu, Zifeng Zhuang, Donglin Wang
Due to the existence of the similarity trap, such consistency regularization improperly increases the probability that the model's predictions are consistent across segment pairs, and thus reduces the confidence in reward learning, since the augmented distribution does not match the original one in PbRL.
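For context, the generic consistency-regularization scheme being critiqued looks roughly like the sketch below: a Bradley-Terry preference model over segment pairs, with an extra term pushing predictions on augmented pairs toward those on the originals. The helper names and the exact regularizer form are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def preference_logits(reward_model, seg_a, seg_b):
    # Bradley-Terry style preference logit: difference of the summed
    # per-step rewards predicted for the two segments.
    return reward_model(seg_a).sum(dim=-1) - reward_model(seg_b).sum(dim=-1)

def consistency_regularizer(reward_model, seg_a, seg_b, aug_a, aug_b):
    # Push the prediction on augmented segment pairs toward the prediction
    # on the originals. Per the abstract, the "similarity trap" can make
    # this backfire when the augmented distribution diverges from the
    # original one.
    p_orig = torch.sigmoid(preference_logits(reward_model, seg_a, seg_b)).detach()
    logit_aug = preference_logits(reward_model, aug_a, aug_b)
    return F.binary_cross_entropy_with_logits(logit_aug, p_orig)
```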
no code implementations • NeurIPS 2023 • Jinxin Liu, Hongyin Zhang, Zifeng Zhuang, Yachen Kang, Donglin Wang, Bin Wang
Naturally, such a paradigm raises three core questions that are not fully answered by prior non-iterative offline RL counterparts such as reward-conditioned policies: (q1) What information should we transfer from the inner level to the outer level?
1 code implementation • 22 Jun 2023 • Jinxin Liu, Ziqi Zhang, Zhenyu Wei, Zifeng Zhuang, Yachen Kang, Sibo Gai, Donglin Wang
Offline reinforcement learning (RL) aims to learn a policy using only pre-collected and fixed data.
no code implementations • 12 Mar 2023 • Min Zhang, Zifeng Zhuang, Zhitao Wang, Donglin Wang, Wenbin Li
OOD exacerbates inconsistencies in the magnitudes and directions of task gradients, making it challenging for GBML to optimize the meta-knowledge by minimizing the sum of task gradients in each minibatch.
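The conflict is easy to quantify: negative cosine similarity between per-task gradients signals directional conflict, and a norm ratio far from 1 signals magnitude inconsistency, both of which make the summed meta-gradient a poor compromise. A two-task diagnostic sketch (illustrative only, not the paper's method):

```python
import torch

def task_gradient_conflict(loss_1, loss_2, params):
    # Flattened per-task gradients w.r.t. the shared meta-parameters.
    g1 = torch.cat([g.reshape(-1) for g in
                    torch.autograd.grad(loss_1, params, retain_graph=True)])
    g2 = torch.cat([g.reshape(-1) for g in
                    torch.autograd.grad(loss_2, params, retain_graph=True)])
    # cos < 0: the two tasks pull the meta-update in opposing directions.
    cos = torch.dot(g1, g2) / (g1.norm() * g2.norm() + 1e-8)
    # norm_ratio far from 1: one task dominates the summed gradient.
    norm_ratio = g1.norm() / (g2.norm() + 1e-8)
    return cos.item(), norm_ratio.item()
```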
2 code implementations • 22 Feb 2023 • Zifeng Zhuang, Kun Lei, Jinxin Liu, Donglin Wang, Yilang Guo
Offline reinforcement learning (RL) is a challenging setting in which existing off-policy actor-critic methods perform poorly due to overestimation of the values of out-of-distribution state-action pairs.
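A toy illustration of the overestimation pathology (not the paper's method): value estimates are unavoidably noisy, and maximizing over many candidate actions systematically selects the largest positive errors, which offline data cannot correct for out-of-distribution actions.

```python
import numpy as np

rng = np.random.default_rng(0)
true_q = np.zeros(1000)                      # every action is truly worth 0
q_hat = true_q + rng.normal(0.0, 1.0, 1000)  # noisy value estimates

# Evaluating one fixed, in-distribution action is unbiased on average,
# but the max over many actions picks out the largest positive error.
print(q_hat[0])     # close to 0
print(q_hat.max())  # around +3: systematic overestimation
```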