1 code implementation • 25 Feb 2024 • Shenao Zhang, Sirui Zheng, Shuqi Ke, Zhihan Liu, Wanxin Jin, Jianbo Yuan, Yingxiang Yang, Hongxia Yang, Zhaoran Wang
Specifically, we develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL. This significantly reduces the amount of data needed for learning, particularly when the difference between the ideal policy and the LLM-informed policy is small, which suggests that the initial policy is close to optimal and reduces the need for further exploration.
no code implementations • 23 Aug 2023 • Ni Dang, Tao Shi, Zengjie Zhang, Wanxin Jin, Marion Leibold, Martin Buss
Nevertheless, an important indicator of the driving style, i.e., how an AV reacts to its nearby AVs, is not fully incorporated in the feature design of previous ME-IRL methods.
no code implementations • 29 Sep 2022 • YiXuan Wang, Simon Sinong Zhan, Ruochen Jiao, Zhilu Wang, Wanxin Jin, Zhuoran Yang, Zhaoran Wang, Chao Huang, Qi Zhu
It is quite challenging to ensure the safety of reinforcement learning (RL) agents in an unknown and stochastic environment under hard constraints that require the system state not to reach certain specified unsafe regions.
1 code implementation • 24 Sep 2022 • Zehui Lu, Wanxin Jin, Shaoshuai Mou, Brian D. O. Anderson
Unlike classical techniques for tuning controller parameters, we allow tunable parameters to appear in both the system dynamics and the objective functions of each agent.
no code implementations • 18 Jul 2022 • Xuan Wang, Yizhi Zhou, Wanxin Jin
In the inverse problem, where each robot aims to find (learn) its objective (and dynamics) parameters to mimic given coordination demonstrations, D3G proposes a differentiation solver based on Differential Pontryagin's Maximum Principle, which allows each robot to update its parameters in a distributed and coordinated manner.
no code implementations • 25 Dec 2021 • Wanxin Jin, Alp Aydinoglu, Mathew Halm, Michael Posa
This paper investigates the learning, or system identification, of a class of piecewise-affine dynamical systems known as linear complementarity systems (LCSs).
1 code implementation • NeurIPS 2021 • Wanxin Jin, Shaoshuai Mou, George J. Pappas
We propose a Safe Pontryagin Differentiable Programming (Safe PDP) methodology, which establishes a theoretical and algorithmic framework to solve a broad class of safety-critical learning and control tasks -- problems that require the guarantee of safety constraint satisfaction at any stage of the learning and control process.
1 code implementation • 30 Nov 2020 • Wanxin Jin, Todd D. Murphey, Zehui Lu, Shaoshuai Mou
This paper proposes a novel approach that enables a robot to learn an objective function incrementally from human directional corrections.
no code implementations • 28 Oct 2020 • Wanxin Jin, Zihao Liang, Shaoshuai Mou
This paper proposes an inverse optimal control method which enables a robot to incrementally learn a control objective function from a collection of trajectory segments.
Robotics
2 code implementations • 5 Aug 2020 • Wanxin Jin, Todd D. Murphey, Dana Kulić, Neta Ezer, Shaoshuai Mou
The time stamps of the keyframes can be different from the time of the robot's actual execution.
no code implementations • 15 Jun 2020 • Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou
This paper develops an approach to learn a policy of a dynamical system that is guaranteed to be both provably safe and goal-reaching.
1 code implementation • NeurIPS 2020 • Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou
This paper develops a Pontryagin Differentiable Programming (PDP) methodology, which establishes a unified framework to solve a broad class of learning and control tasks.
2 code implementations • 21 Mar 2018 • Wanxin Jin, Dana Kulić, Shaoshuai Mou, Sandra Hirche
We handle the problem by proposing the recovery matrix, which establishes a relationship between available observations of the trajectory and weights of given candidate features.
Robotics • Systems and Control