no code implementations • 7 Mar 2024 • Hoang Giang Pham, Tien Thanh Dam, Ngan Ha Duong, Tien Mai, Minh Hoang Ha
To tackle this problem, we explore three types of valid cuts, namely, outer-approximation and submodular cuts to handle the nonlinear objective function, as well as sub-tour elimination cuts to address the complex routing constraints.
1 code implementation • 20 Feb 2024 • Huy Hoang, Tien Mai, Pradeep Varakantham
In this paper, we propose an offline IL approach that leverages the larger set of sub-optimal demonstrations while effectively mimicking expert trajectories.
1 code implementation • 16 Dec 2023 • Huy Hoang, Tien Mai, Pradeep Varakantham
In an exhaustive set of experiments, we demonstrate that our approach is able to outperform top benchmark approaches for solving Constrained RL problems, with respect to expected cost, CVaR cost, or even unknown cost constraints.
no code implementations • 10 Oct 2023 • The Viet Bui, Tien Mai, Thanh Hong Nguyen
This paper concerns imitation learning (IL) (i.e., the problem of learning to mimic expert behaviors from demonstrations) in cooperative multi-agent systems.
no code implementations • 20 Aug 2023 • The Viet Bui, Tien Mai, Thanh Hong Nguyen
Training agents in multi-agent competitive games presents significant challenges due to their intricate nature.
no code implementations • 7 Jun 2023 • Hung Tran, Tien Mai
To address this, we propose a random utility maximization (RUM) based model that considers each subset of choice alternatives as a composite alternative, where individuals choose a subset according to the RUM framework.
no code implementations • 27 Jan 2023 • Hao Jiang, Tien Mai, Pradeep Varakantham, Minh Huy Hoang
Constrained Reinforcement Learning has been employed to enforce safety constraints on policy through the use of expected cost constraints.
no code implementations • 30 Oct 2022 • The Viet Bui, Tien Mai, Thanh H. Nguyen
The core idea of our new algorithm is to create a new imitator to imitate the victim agent's policy, while the adversarial policy is trained based not only on interactions with the victim agent but also on feedback from the imitator to forecast the victim's intention.
no code implementations • 20 Aug 2022 • The Viet Bui, Tien Mai, Patrick Jaillet
We study inverse reinforcement learning (IRL) and imitation learning (IM), the problems of recovering a reward or policy function from an expert's demonstrated trajectories.
no code implementations • 31 May 2022 • Avinandan Bose, Arunesh Sinha, Tien Mai
Distributionally robust optimization (DRO) has shown a lot of promise in providing robustness in learning as well as sample-based optimization problems.
no code implementations • 27 Apr 2022 • Tien Mai, The Viet Bui, Quoc Phong Nguyen, Tho V. Le
This work concerns the estimation of recursive route choice models in the situation that the trip observations are incomplete, i.e., there are unconnected links (or nodes) in the observations.
no code implementations • 31 Dec 2021 • Tien Mai, Patrick Jaillet
Stochastic and soft optimal policies resulting from entropy-regularized Markov decision processes (ER-MDP) are desirable for exploration and imitation learning applications.
no code implementations • 18 Aug 2020 • Tien Mai, Patrick Jaillet
We show that the entropy-regularized MDP is equivalent to a stochastic MDP model, and is strictly subsumed by the general regularized MDP.
no code implementations • 16 Nov 2019 • Tien Mai, Quoc Phong Nguyen, Kian Hsiang Low, Patrick Jaillet
We consider the problem of recovering an expert's reward function with inverse reinforcement learning (IRL) when there are missing/incomplete state-action pairs or observations in the demonstrated trajectories.
no code implementations • 16 Nov 2019 • Tien Mai, Kennard Chan, Patrick Jaillet
We consider the problem of learning from demonstrated trajectories with inverse reinforcement learning (IRL).