no code implementations • 11 Jul 2023 • Michael Przystupa, Faezeh Haghverd, Martin Jagersand, Samuele Tosatto
Movement primitives are trainable parametric models that reproduce robotic movements starting from a limited set of demonstrations.
1 code implementation • 6 Dec 2022 • Amirmohammad Karimi, Jun Jin, Jun Luo, A. Rupam Mahmood, Martin Jagersand, Samuele Tosatto
In classic reinforcement learning algorithms, agents make decisions at discrete and fixed time intervals.
1 code implementation • 4 Feb 2022 • Samuele Tosatto, Andrew Patterson, Martha White, A. Rupam Mahmood
The policy gradient theorem (Sutton et al., 2000) prescribes the usage of a cumulative discounted state distribution under the target policy to approximate the gradient.
1 code implementation • 22 Dec 2021 • Shivam Garg, Samuele Tosatto, Yangchen Pan, Martha White, A. Rupam Mahmood
Policy gradient (PG) estimators are ineffective for softmax policies that are sub-optimally saturated, that is, policies that concentrate their probability mass on sub-optimal actions.
1 code implementation • 9 Mar 2021 • Qingfeng Lan, Samuele Tosatto, Homayoon Farrahi, A. Rupam Mahmood
As a key component in reinforcement learning, the reward function is usually devised carefully to guide the agent.
no code implementations • 27 Oct 2020 • Samuele Tosatto, João Carvalho, Jan Peters
Off-policy Reinforcement Learning (RL) holds the promise of better data efficiency as it allows sample reuse and potentially enables safe interaction with the environment.
no code implementations • 26 Oct 2020 • Samuele Tosatto, Georgia Chalvatzaki, Jan Peters
Parameterized movement primitives have been extensively used for imitation learning of robotic tasks.
no code implementations • 26 Feb 2020 • Samuele Tosatto, Jonas Stadtmueller, Jan Peters
The empirical analysis shows that the dimensionality reduction in parameter space is more effective than in configuration space, as it enables the representation of the movements with a significant reduction of parameters.
no code implementations • 29 Jan 2020 • Samuele Tosatto, Riad Akrour, Jan Peters
The Nadaraya-Watson kernel estimator is among the most popular nonparametric regression techniques thanks to its simplicity.
1 code implementation • 8 Jan 2020 • Samuele Tosatto, João Carvalho, Hany Abdulsamad, Jan Peters
Reinforcement learning (RL) algorithms still suffer from high sample complexity despite outstanding recent successes.
no code implementations • ICML 2017 • Samuele Tosatto, Matteo Pirotta, Carlo D’Eramo, Marcello Restelli
This paper studies B-FQI, an Approximated Value Iteration (AVI) algorithm that uses a boosting procedure to estimate the action-value function in reinforcement learning problems.