Search Results for author: Zifeng Zhuang

Found 11 papers, 4 papers with code

DIDI: Diffusion-Guided Diversity for Offline Behavioral Generation

no code implementations · 23 May 2024 · Jinxin Liu, Xinghong Guo, Zifeng Zhuang, Donglin Wang

The goal of DIDI is to learn a diverse set of skills from a mixture of label-free offline data.

D4RL Decision Making

Reinformer: Max-Return Sequence Modeling for Offline RL

1 code implementation · 14 May 2024 · Zifeng Zhuang, Dengyun Peng, Jinxin Liu, Ziqi Zhang, Donglin Wang

In this work, we introduce the concept of max-return sequence modeling which integrates the goal of maximizing returns into existing sequence models.

D4RL Offline RL +1
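One common way to bias a sequence model toward high returns — and a plausible reading of "max-return sequence modeling" — is asymmetric (expectile) regression on returns, where under-predicting the return is penalized far more heavily than over-predicting it. The sketch below is illustrative only; the function name and the `tau` value are assumptions, not taken from the paper:

```python
import numpy as np

def expectile_loss(pred, target, tau=0.99):
    # Asymmetric squared error: errors where target > pred get weight tau,
    # the rest get weight (1 - tau). With tau near 1, the model is pushed
    # toward the maximum return observed in the offline data.
    diff = target - pred
    weight = np.where(diff > 0, tau, 1.0 - tau)
    return float(np.mean(weight * diff ** 2))
```

With `tau = 0.99`, predicting 0 when the target return is 1 costs 0.99, while predicting 1 when the target is 0 costs only 0.01, so the learned return estimate drifts toward the best returns in the dataset.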

Context-Former: Stitching via Latent Conditioned Sequence Modeling

no code implementations · 29 Jan 2024 · Ziqi Zhang, Jingzehua Xu, Jinxin Liu, Zifeng Zhuang, Donglin Wang, Miao Liu, Shuai Zhang

Offline reinforcement learning (RL) algorithms can learn better decision-making than the behavior policy by stitching together suboptimal trajectories to derive more optimal ones.

D4RL Imitation Learning +2

A dynamical clipping approach with task feedback for Proximal Policy Optimization

no code implementations · 12 Dec 2023 · Ziqi Zhang, Jingzehua Xu, Zifeng Zhuang, Jinxin Liu, Donglin Wang, Shuai Zhang

Unlike previous clipping approaches, we treat increasing the maximum cumulative return of a reinforcement learning (RL) task as that task's preference, and propose a bi-level proximal policy optimization paradigm: it not only optimizes the policy but also dynamically adjusts the clipping bound to reflect the task's preference, further improving the training outcomes and stability of PPO.

Language Modelling Large Language Model +1
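The standard PPO clipped surrogate fixes the clipping bound ε; the blurb above describes adjusting it dynamically according to task return. A minimal sketch of that idea — the `adjust_eps` update rule below is a hypothetical stand-in for the paper's outer-level mechanism, not its actual algorithm:

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps):
    # Standard PPO clipped surrogate (to be maximized): take the minimum of
    # the unclipped and clipped importance-weighted advantage.
    return np.minimum(ratio * advantage,
                      np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage)

def adjust_eps(eps, curr_return, prev_return, step=0.02, lo=0.05, hi=0.3):
    # Hypothetical outer-level rule: widen the clipping bound when returns
    # improve (allowing larger policy updates), narrow it otherwise.
    if curr_return > prev_return:
        return min(hi, eps + step)
    return max(lo, eps - step)
```

In a bi-level setup, the inner level would run the usual PPO update with the current ε, while the outer level would adjust ε between iterations based on the observed returns.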

RSG: Fast Learning Adaptive Skills for Quadruped Robots by Skill Graph

no code implementations · 10 Nov 2023 · Hongyin Zhang, Diyuan Shi, Zifeng Zhuang, Han Zhao, Zhenyu Wei, Feng Zhao, Sibo Gai, Shangke Lyu, Donglin Wang

Developing robotic intelligent systems that can adapt quickly to unseen wild situations is one of the critical challenges in pursuing autonomous robotics.

Implicit Relations

Improving Offline-to-Online Reinforcement Learning with Q Conditioned State Entropy Exploration

no code implementations · 7 Oct 2023 · Ziqi Zhang, Xiao Xiong, Zifeng Zhuang, Jinxin Liu, Donglin Wang

Studying how to fine-tune policies pre-trained with offline reinforcement learning (RL) is profoundly significant for enhancing the sample efficiency of RL algorithms.

Offline RL reinforcement-learning +1

STRAPPER: Preference-based Reinforcement Learning via Self-training Augmentation and Peer Regularization

1 code implementation · 19 Jul 2023 · Yachen Kang, Li He, Jinxin Liu, Zifeng Zhuang, Donglin Wang

Due to the existence of the similarity trap, such consistency regularization improperly increases the consistency probability of the model's predictions between segment pairs and thus reduces confidence in reward learning, since the augmented distribution does not match the original one in PbRL.

General Classification reinforcement-learning

Design from Policies: Conservative Test-Time Adaptation for Offline Policy Optimization

no code implementations · NeurIPS 2023 · Jinxin Liu, Hongyin Zhang, Zifeng Zhuang, Yachen Kang, Donglin Wang, Bin Wang

Naturally, such a paradigm raises three core questions that are not fully answered by prior non-iterative offline RL counterparts like reward-conditioned policy: (q1) What information should we transfer from the inner-level to the outer-level?

Offline RL Test-time Adaptation

RotoGBML: Towards Out-of-Distribution Generalization for Gradient-Based Meta-Learning

no code implementations · 12 Mar 2023 · Min Zhang, Zifeng Zhuang, Zhitao Wang, Donglin Wang, Wenbin Li

OOD exacerbates inconsistencies in magnitudes and directions of task gradients, which brings challenges for GBML to optimize the meta-knowledge by minimizing the sum of task gradients in each minibatch.

Few-Shot Image Classification Meta-Learning +1
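The gradient inconsistency described above can be made concrete with a simple diagnostic: the cosine similarity between per-task gradients. This sketch only illustrates the symptom the abstract names; it does not reproduce RotoGBML's own remedy:

```python
import numpy as np

def gradient_cosine(g1, g2):
    # Cosine similarity between two task gradients: values near -1 mean the
    # tasks pull the meta-parameters in opposing directions, so summing them
    # (the usual GBML meta-update over a minibatch) can largely cancel out.
    g1 = np.asarray(g1, dtype=float)
    g2 = np.asarray(g2, dtype=float)
    return float(np.dot(g1, g2) / (np.linalg.norm(g1) * np.linalg.norm(g2)))
```

Under out-of-distribution task shift, both the norms and the directions of these gradients diverge, which is the optimization difficulty the paper sets out to address.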

Behavior Proximal Policy Optimization

2 code implementations · 22 Feb 2023 · Zifeng Zhuang, Kun Lei, Jinxin Liu, Donglin Wang, Yilang Guo

Offline reinforcement learning (RL) is a challenging setting where existing off-policy actor-critic methods perform poorly due to the overestimation of out-of-distribution state-action pairs.

D4RL Offline RL +1
