Search Results for author: Yuhao Ding

Found 12 papers, 2 papers with code

Safe and Balanced: A Framework for Constrained Multi-Objective Reinforcement Learning

no code implementations • 26 May 2024 • Shangding Gu, Bilgehan Sel, Yuhao Ding, Lu Wang, Qingwei Lin, Alois Knoll, Ming Jin

In numerous reinforcement learning (RL) problems involving safety-critical systems, a key challenge lies in balancing multiple objectives while simultaneously meeting all stringent safety constraints.
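For context, the generic problem shape here can be written as a constrained multi-objective MDP; this is a standard formulation rather than the paper's exact objective (the weights $w_k$, costs $c_j$, and thresholds $d_j$ are illustrative):

$$\max_{\pi} \ \sum_{k=1}^{K} w_k V_{r_k}^{\pi} \quad \text{s.t.} \quad V_{c_j}^{\pi} \leq d_j, \quad j = 1, \dots, m,$$

where $V_{r_k}^{\pi}$ is the expected return under the $k$-th reward objective and $V_{c_j}^{\pi}$ the expected cumulative cost for the $j$-th safety constraint.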

A CMDP-within-online framework for Meta-Safe Reinforcement Learning

no code implementations • 26 May 2024 • Vanshaj Khattar, Yuhao Ding, Bilgehan Sel, Javad Lavaei, Ming Jin

Meta-reinforcement learning has been widely used as a learning-to-learn framework to solve unseen tasks with limited experience.

Scalable Multi-Agent Reinforcement Learning with General Utilities

no code implementations • 15 Feb 2023 • Donghao Ying, Yuhao Ding, Alec Koppel, Javad Lavaei

The objective is to find a localized policy that maximizes the average of the team's local utility functions without full observability of each agent in the team.

Multi-agent Reinforcement Learning · reinforcement-learning +1
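A general utility here means each agent's objective is a (possibly nonlinear) function of its occupancy measure rather than a cumulative reward. Assuming $n$ agents with local utilities $f_i$ and local occupancy measures $\lambda_i^{\pi}$ (notation illustrative), the team problem reads:

$$\max_{\pi} \ \frac{1}{n} \sum_{i=1}^{n} f_i(\lambda_i^{\pi}),$$

with cumulative reward recovered as the linear special case $f_i(\lambda) = \langle r_i, \lambda \rangle$.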

Non-stationary Risk-sensitive Reinforcement Learning: Near-optimal Dynamic Regret, Adaptive Detection, and Separation Design

no code implementations • 19 Nov 2022 • Yuhao Ding, Ming Jin, Javad Lavaei

We study risk-sensitive reinforcement learning (RL) based on an entropic risk measure in episodic non-stationary Markov decision processes (MDPs).

Reinforcement Learning (RL)
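The entropic risk measure in question is standard: for a risk parameter $\beta \neq 0$ and return $X$,

$$\rho_{\beta}(X) = \frac{1}{\beta} \log \mathbb{E}\left[e^{\beta X}\right] \approx \mathbb{E}[X] + \frac{\beta}{2} \mathrm{Var}(X),$$

so $\beta < 0$ yields risk-averse behavior, $\beta > 0$ risk-seeking, and $\beta \to 0$ recovers the risk-neutral expectation.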

Policy-based Primal-Dual Methods for Concave CMDP with Variance Reduction

1 code implementation • 22 May 2022 • Donghao Ying, Mengzi Amy Guo, Hyunin Lee, Yuhao Ding, Javad Lavaei, Zuo-Jun Max Shen

In the exact setting, we prove an $O(T^{-1/3})$ convergence rate for both the average optimality gap and constraint violation, which further improves to $O(T^{-1/2})$ under strong concavity of the objective in the occupancy measure.
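A minimal sketch of the primal-dual scheme, assuming a Lagrangian over policy parameters $\theta$ and multipliers $\mu \geq 0$ with concave objective $f$ and constraint $g(\lambda^{\pi_\theta}) \geq 0$ on the occupancy measure (step sizes $\eta_\theta$, $\eta_\mu$ are illustrative, and the paper's variance-reduced gradient estimators are omitted):

$$L(\theta, \mu) = f(\lambda^{\pi_\theta}) + \mu^{\top} g(\lambda^{\pi_\theta}), \qquad \theta_{t+1} = \theta_t + \eta_\theta \nabla_\theta L(\theta_t, \mu_t), \qquad \mu_{t+1} = \left[\mu_t - \eta_\mu \, g(\lambda^{\pi_{\theta_t}})\right]_{+}.$$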

Provably Efficient Primal-Dual Reinforcement Learning for CMDPs with Non-stationary Objectives and Constraints

no code implementations • 28 Jan 2022 • Yuhao Ding, Javad Lavaei

We consider primal-dual-based reinforcement learning (RL) in episodic constrained Markov decision processes (CMDPs) with non-stationary objectives and constraints, which plays a central role in ensuring the safety of RL in time-varying environments.

Reinforcement Learning (RL) · Safe Exploration
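In non-stationary CMDPs, performance is typically measured against the per-episode optimum; a common pair of metrics, stated generically rather than as the paper's exact definitions:

$$\text{D-Regret}(T) = \sum_{t=1}^{T} \left( V_{r,t}^{\pi_t^{*}} - V_{r,t}^{\pi_t} \right), \qquad \text{Violation}(T) = \sum_{t=1}^{T} \left[ b_t - V_{g,t}^{\pi_t} \right]_{+},$$

where $\pi_t^{*}$ is optimal for episode $t$'s time-varying reward $r_t$, utility $g_t$, and threshold $b_t$.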

On the Global Optimum Convergence of Momentum-based Policy Gradient

no code implementations • 19 Oct 2021 • Yuhao Ding, Junzi Zhang, Javad Lavaei

For generic Fisher-non-degenerate policy parametrizations, ours is the first single-loop, finite-batch PG algorithm achieving $\tilde{O}(\epsilon^{-3})$ sample complexity for global optimality.
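For intuition, single-loop momentum PG methods in this line of work typically maintain a STORM-style recursive gradient estimator; a generic form (not necessarily the paper's exact update, with illustrative step sizes $\alpha_t$ and momentum weights $\eta_t$):

$$d_t = \nabla \widehat{J}(\theta_t) + (1 - \eta_t)\left(d_{t-1} - \nabla \widehat{J}(\theta_{t-1})\right), \qquad \theta_{t+1} = \theta_t + \alpha_t d_t,$$

where $\nabla \widehat{J}$ is a mini-batch policy-gradient estimate; the momentum correction reduces variance without the large periodic batches required by double-loop variance-reduced methods.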

A Dual Approach to Constrained Markov Decision Processes with Entropy Regularization

no code implementations • 17 Oct 2021 • Donghao Ying, Yuhao Ding, Javad Lavaei

We study entropy-regularized constrained Markov decision processes (CMDPs) under the soft-max parameterization, in which an agent aims to maximize the entropy-regularized value function while satisfying constraints on the expected total utility.
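The two ingredients named in the abstract have standard forms: the soft-max (tabular) parameterization and the entropy-regularized value, with regularization weight $\tau > 0$ (notation illustrative):

$$\pi_{\theta}(a \mid s) = \frac{\exp(\theta_{s,a})}{\sum_{a'} \exp(\theta_{s,a'})}, \qquad V_{\tau}^{\pi}(s) = \mathbb{E}^{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} \big(r(s_t, a_t) + \tau \, \mathcal{H}(\pi(\cdot \mid s_t))\big) \,\Big|\, s_0 = s\right],$$

where $\mathcal{H}$ denotes Shannon entropy.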

Ontology-Enhanced Slot Filling

no code implementations • 25 Aug 2021 • Yuhao Ding, Yik-Cheung Tam

In multi-domain task-oriented dialog systems, user utterances and system responses may mention multiple named entities and attribute values.

dialog state tracking · slot-filling +1
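To make the slot-filling task concrete, here is a hypothetical example in Python; the utterance, slot names, and ontology entries below are invented for illustration and are not taken from the paper:

    # Hypothetical multi-domain utterance mentioning several entities and attribute values.
    utterance = "Book a table at Sala Thai for 4 people, and a taxi to the restaurant at 7pm."

    # Slot filling maps spans of the utterance to domain-specific slots; an
    # ontology constrains which values are valid candidates for each slot.
    ontology = {
        "restaurant-name": {"Sala Thai", "Curry Garden"},
        "restaurant-book-people": {"1", "2", "3", "4", "5"},
        "taxi-leave-at": None,  # free-form values such as times
    }

    predicted_slots = {
        "restaurant-name": "Sala Thai",
        "restaurant-book-people": "4",
        "taxi-leave-at": "7pm",
    }

    # A simple ontology check: flag any predicted value outside the known candidate set.
    for slot, value in predicted_slots.items():
        candidates = ontology.get(slot)
        if candidates is not None and value not in candidates:
            print(f"Out-of-ontology value for {slot!r}: {value!r}")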
