no code implementations • 3 May 2024 • Georgios Tzannetos, Parameswaran Kamalaruban, Adish Singla
… a target distribution over complex tasks.
no code implementations • 4 Mar 2024 • Andi Nika, Debmalya Mandal, Parameswaran Kamalaruban, Georgios Tzannetos, Goran Radanović, Adish Singla
Moreover, we extend our analysis to the approximate optimization setting and derive exponentially decaying convergence rates for both RLHF and DPO.
1 code implementation • 10 Feb 2024 • Rati Devidze, Parameswaran Kamalaruban, Adish Singla
Reward functions are central in specifying the task we want a reinforcement learning agent to perform.
no code implementations • 9 Feb 2024 • Debmalya Mandal, Andi Nika, Parameswaran Kamalaruban, Adish Singla, Goran Radanović
We aim to design algorithms that identify a near-optimal policy from the corrupted data, with provable guarantees.
1 code implementation • 25 Apr 2023 • Georgios Tzannetos, Bárbara Gomes Ribeiro, Parameswaran Kamalaruban, Adish Singla
We consider the problem of curriculum design for reinforcement learning (RL) agents in contextual multi-task settings.
no code implementations • 13 Apr 2023 • Umang Bhatt, Valerie Chen, Katherine M. Collins, Parameswaran Kamalaruban, Emma Kallina, Adrian Weller, Ameet Talwalkar
In this work, we propose learning a decision support policy that, for a given input, chooses which form of support, if any, to provide.
1 code implementation • 12 Feb 2022 • Luca Viano, Yu-Ting Huang, Parameswaran Kamalaruban, Craig Innes, Subramanian Ramamoorthy, Adrian Weller
Imitation learning (IL) is a popular paradigm for training policies in robotic systems when specifying the reward function is difficult.
1 code implementation • NeurIPS 2021 • Rati Devidze, Goran Radanović, Parameswaran Kamalaruban, Adish Singla
By being explicable, we seek to capture two properties: (a) informativeness so that the rewards speed up the agent's convergence, and (b) sparseness as a proxy for ease of interpretability of the rewards.
no code implementations • AAAI Workshop AdvML 2022 • Dishanika Dewani Denipitiyage, Thalaiyasingam Ajanthan, Parameswaran Kamalaruban, Adrian Weller
The literature on adversarial robustness has recently expanded from images to other domains, such as point clouds.
1 code implementation • NeurIPS 2021 • Gaurav Yengera, Rati Devidze, Parameswaran Kamalaruban, Adish Singla
In particular, we study how to design a personalized curriculum over demonstrations to speed up the learner's convergence.
1 code implementation • NeurIPS 2021 • Luca Viano, Yu-Ting Huang, Parameswaran Kamalaruban, Adrian Weller, Volkan Cevher
We study the inverse reinforcement learning (IRL) problem under a transition dynamics mismatch between the expert and the learner.
no code implementations • 1 Jul 2020 • Martin Troussard, Emmanuel Pignat, Parameswaran Kamalaruban, Sylvain Calinon, Volkan Cevher
This paper proposes an inverse reinforcement learning (IRL) framework to accelerate learning when the learner-teacher interaction is limited during training.
no code implementations • 23 Jun 2020 • Parameswaran Kamalaruban, Rati Devidze, Volkan Cevher, Adish Singla
However, the applicability of potential-based reward shaping is limited in settings where (i) the state space is very large, and it is challenging to compute an appropriate potential function, (ii) the feedback signals are noisy, and even with shaped rewards the agent could be trapped in local optima, and (iii) changing the rewards alone is not sufficient, and effective shaping requires changing the dynamics.
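For context, potential-based shaping adds F(s, s') = γΦ(s') − Φ(s) to the environment reward. The sketch below (a toy chain MDP with a hypothetical potential, not the paper's setup) demonstrates the telescoping property that makes such shaping preserve optimal policies:

```python
import numpy as np

def shaping_bonus(phi, s, s_next, gamma):
    """Potential-based shaping term F(s, s') = gamma * phi[s'] - phi[s]."""
    return gamma * phi[s_next] - phi[s]

# Toy chain MDP: states 0..4, potential = negative distance to goal state 4.
phi = np.array([-4.0, -3.0, -2.0, -1.0, 0.0])
gamma = 0.9

# Along any trajectory the discounted shaping terms telescope, so the shaped
# return differs from the original one only by gamma^T * phi[s_T] - phi[s_0];
# optimal policies are therefore unchanged (Ng et al., 1999).
trajectory = [0, 1, 2, 3, 4]
total = sum(gamma**t * shaping_bonus(phi, trajectory[t], trajectory[t + 1], gamma)
            for t in range(len(trajectory) - 1))
T = len(trajectory) - 1
expected = gamma**T * phi[trajectory[-1]] - phi[trajectory[0]]
print(abs(total - expected) < 1e-9)  # telescoping identity holds
```

The identity also makes limitation (i) above concrete: the benefit of shaping hinges entirely on choosing a good potential Φ, which is hard when the state space is very large.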
1 code implementation • 14 Feb 2020 • Parameswaran Kamalaruban, Yu-Ting Huang, Ya-Ping Hsieh, Paul Rolland, Cheng Shi, Volkan Cevher
We introduce a sampling perspective to tackle the challenging task of training robust Reinforcement Learning (RL) agents.
no code implementations • 1 Dec 2019 • Donghwan Lee, Niao He, Parameswaran Kamalaruban, Volkan Cevher
This article reviews recent advances in multi-agent reinforcement learning algorithms for large-scale control systems and communication networks, which learn to communicate and cooperate.
no code implementations • 28 May 2019 • Parameswaran Kamalaruban, Rati Devidze, Volkan Cevher, Adish Singla
We study the problem of inverse reinforcement learning (IRL) with the added twist that the learner is assisted by a helpful teacher.
no code implementations • 8 Nov 2018 • Teresa Yeo, Parameswaran Kamalaruban, Adish Singla, Arpit Merchant, Thibault Asselborn, Louis Faucon, Pierre Dillenbourg, Volkan Cevher
We consider the machine teaching problem in a classroom-like setting wherein the teacher has to deliver the same examples to a diverse group of students.
no code implementations • 6 Jun 2018 • Parameswaran Kamalaruban, Victor Perrier, Hassan Jameel Asghar, Mohamed Ali Kaafar
However, differential privacy provides the same level of protection for all elements (individuals and attributes) in the data.
no code implementations • 20 May 2018 • Parameswaran Kamalaruban, Robert C. Williamson, Xinhua Zhang
In special cases like the Aggregating Algorithm (Vovk, 1995) with mixable losses and the Weighted Average Algorithm (Kivinen and Warmuth, 1999) with exp-concave losses, it is possible to achieve $O(1)$ regret bounds.
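The $O(1)$ claim can be stated precisely. For an $\eta$-mixable loss $\ell$ and $N$ experts, the Aggregating Algorithm guarantees, for every horizon $T$:

```latex
\mathrm{Regret}_T
  \;=\; \sum_{t=1}^{T} \ell(\hat{y}_t, y_t)
  \;-\; \min_{1 \le i \le N} \sum_{t=1}^{T} \ell(y_t^{(i)}, y_t)
  \;\le\; \frac{\ln N}{\eta},
```

a bound independent of $T$, i.e. constant regret, in contrast to the $O(\sqrt{T})$ rates typical of general convex losses.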
no code implementations • 20 May 2018 • Parameswaran Kamalaruban
This thesis presents geometric insights into three types of two-player prediction games -- namely the general learning task, prediction with expert advice, and online convex optimization.
no code implementations • 20 May 2018 • Parameswaran Kamalaruban, Robert C. Williamson
The cost-sensitive classification problem plays a crucial role in mission-critical machine learning applications and differs from traditional classification by taking misclassification costs into account.
no code implementations • NeurIPS 2017 • Kush Bhatia, Prateek Jain, Parameswaran Kamalaruban, Purushottam Kar
We present the first efficient and provably consistent estimator for the robust regression problem.
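The paper's estimator is not reproduced here; as a rough illustration of the robust regression setting, the following is a generic alternating least-squares/hard-thresholding heuristic in the same spirit, with all names and parameters hypothetical:

```python
import numpy as np

def robust_ls(X, y, n_inliers, n_iters=20):
    """Alternate between least squares on a candidate inlier set and
    re-selecting the points with the smallest residuals (a hard-thresholding
    heuristic; a sketch, not the paper's exact estimator)."""
    w = np.linalg.lstsq(X, y, rcond=None)[0]  # start from plain least squares
    for _ in range(n_iters):
        resid = np.abs(y - X @ w)
        idx = np.argsort(resid)[:n_inliers]   # keep smallest residuals
        w = np.linalg.lstsq(X[idx], y[idx], rcond=None)[0]
    return w

rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=n)
y[:40] += 10.0  # corrupt 20% of the responses
w_hat = robust_ls(X, y, n_inliers=150)
print(np.linalg.norm(w_hat - w_true))  # small despite the corruptions
```

Once the corrupted points fall outside the selected set, the inner least-squares fit sees only clean data, so the recovered parameters are close to the truth.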
no code implementations • 8 Sep 2016 • Parameswaran Kamalaruban
Online convex optimization plays a key role in large-scale machine learning.
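As a concrete illustration of the setting (not taken from the paper), here is projected online gradient descent on a stream of quadratic losses; the loss sequence and all parameter names are hypothetical:

```python
import numpy as np

def ogd(grad_fns, dim, eta=0.1, radius=1.0):
    """Projected online gradient descent: play x_t, observe the loss
    gradient, take a decaying step, and project back onto the Euclidean
    ball of the given radius."""
    x = np.zeros(dim)
    plays = []
    for t, grad in enumerate(grad_fns, start=1):
        plays.append(x.copy())
        x = x - (eta / np.sqrt(t)) * grad(x)  # step size eta / sqrt(t)
        norm = np.linalg.norm(x)
        if norm > radius:                      # project onto the ball
            x *= radius / norm
    return plays

# Toy stream: quadratic losses f_t(x) = ||x - z_t||^2 with targets near z*.
rng = np.random.default_rng(1)
z_star = np.array([0.5, -0.3])
targets = [z_star + 0.05 * rng.normal(size=2) for _ in range(500)]
plays = ogd([lambda x, z=z: 2 * (x - z) for z in targets], dim=2)
print(np.linalg.norm(plays[-1] - z_star))  # the iterates approach z*
```

With the standard $\eta_t \propto 1/\sqrt{t}$ step size, this scheme attains $O(\sqrt{T})$ regret against the best fixed point in the ball.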
no code implementations • 1 Jul 2016 • Kush Bhatia, Prateek Jain, Parameswaran Kamalaruban, Purushottam Kar
We illustrate our methods on synthetic datasets and show that our methods indeed are able to consistently recover the optimal parameters despite a large fraction of points being corrupted.