Search Results for author: Enrique Munoz de Cote

Found 5 papers, 0 papers with code

Fully Distributed Actor-Critic Architecture for Multitask Deep Reinforcement Learning

no code implementations • 23 Oct 2021 • Sergio Valcarcel Macua, Ian Davies, Aleksi Tukiainen, Enrique Munoz de Cote

We propose a fully distributed actor-critic architecture, named Diff-DAC, with application to multitask reinforcement learning (MRL).

Continuous Control, Reinforcement Learning

Compatible features for Monotonic Policy Improvement

no code implementations • 9 Oct 2019 • Marcin B. Tomczak, Sergio Valcarcel Macua, Enrique Munoz de Cote, Peter Vrancx

In this work, we establish conditions under which the parametric approximation of the critic does not introduce bias into the updates of the surrogate objective.

Adaptive Sensor Placement for Continuous Spaces

no code implementations • 16 May 2019 • James A. Grant, Alexis Boukouvalas, Ryan-Rhys Griffiths, David S. Leslie, Sattar Vakili, Enrique Munoz de Cote

We consider the problem of adaptively placing sensors along an interval to detect stochastically-generated events.

Thompson Sampling

Diff-DAC: Distributed Actor-Critic for Average Multitask Deep Reinforcement Learning

no code implementations • 28 Oct 2017 • Sergio Valcarcel Macua, Aleksi Tukiainen, Daniel García-Ocaña Hernández, David Baldazo, Enrique Munoz de Cote, Santiago Zazo

We propose a fully distributed actor-critic algorithm approximated by deep neural networks, named Diff-DAC, with application to single-task and to average multitask reinforcement learning (MRL).

Reinforcement Learning (RL)

A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity

no code implementations • 28 Jul 2017 • Pablo Hernandez-Leal, Michael Kaisers, Tim Baarslag, Enrique Munoz de Cote

The key challenge in multiagent learning is learning a best response to the behaviour of other agents, which may be non-stationary: if the other agents adapt their strategy as well, the learning target moves.

Multi-Armed Bandits
