no code implementations • 3 May 2024 • Ksenija Stepanovic, Wendelin Böhmer, Mathijs de Weerdt
This algorithm adapts a standard penalty-based method by dynamically updating the right-hand side of the constraints with a guardrail variable that adds a margin to prevent violations.
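The guardrail idea can be sketched on a toy problem (the problem, numbers, and update rule below are hypothetical illustrations, not the authors' implementation): a quadratic penalty is applied against a *tightened* right-hand side `b - margin`, and the margin grows whenever the true constraint is violated.

```python
# Hypothetical toy problem: minimise f(x) = (x - 2)^2 subject to g(x) = x <= b = 1.
f_grad = lambda x: 2.0 * (x - 2.0)
g = lambda x: x           # constraint function
b = 1.0                   # true right-hand side
rho = 10.0                # penalty weight
margin = 0.0              # guardrail: tightens the effective right-hand side
x, lr = 0.0, 0.05

for _ in range(500):
    # Penalise violations of the *tightened* constraint g(x) <= b - margin.
    viol = max(0.0, g(x) - (b - margin))
    grad = f_grad(x) + 2.0 * rho * viol   # d/dx of penalty term (g'(x) = 1 here)
    x -= lr * grad
    # Guardrail update: if the true constraint is still violated, widen the margin.
    if g(x) > b:
        margin = min(margin + 0.01, 0.5)

print(x)  # close to the constrained optimum x* = 1, without large violations
```

A plain penalty method with a finite weight `rho` settles slightly on the infeasible side; the growing margin pushes the penalised optimum back inside the feasible region.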
no code implementations • 2 Feb 2024 • Grigorii Veviurko, Wendelin Böhmer, Mathijs de Weerdt
In reinforcement learning (RL), different rewards can define the same optimal policy but result in drastically different learning performance.
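A classic instance of this phenomenon is potential-based reward shaping, which provably preserves the optimal policy (the toy MDP below is a hypothetical example, not from the paper):

```python
import numpy as np

# Toy deterministic 3-state chain: actions 0 = left, 1 = right.
# Potential-based shaping r'(s,a,s') = r + gamma*phi[s'] - phi[s]
# changes the rewards but not the optimal policy.
gamma = 0.9
nS, nA = 3, 2
P = np.zeros((nS, nA), dtype=int)   # deterministic successor states
R = np.zeros((nS, nA))              # rewards
for s in range(nS):
    P[s, 0] = max(s - 1, 0)
    P[s, 1] = min(s + 1, nS - 1)
R[1, 1] = 1.0                       # moving right from state 1 pays 1

phi = np.array([0.0, 5.0, -3.0])    # arbitrary potential function

def greedy_policy(R):
    V = np.zeros(nS)
    for _ in range(200):            # value iteration
        Q = R + gamma * V[P]
        V = Q.max(axis=1)
    return Q.argmax(axis=1)

R_shaped = R + gamma * phi[P] - phi[:, None]
print(greedy_policy(R), greedy_policy(R_shaped))  # identical greedy policies
```

Although both reward functions yield the same greedy policy, the intermediate value estimates differ, which is exactly the kind of gap that can produce drastically different learning performance.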
no code implementations • 30 Jul 2023 • Grigorii Veviurko, Wendelin Böhmer, Mathijs de Weerdt
The key challenge to train such models is the computation of the Jacobian of the solution of the optimization problem with respect to its parameters.
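For smooth unconstrained inner problems, that Jacobian is typically obtained from the implicit function theorem: differentiating the optimality condition \(\nabla_x f(x^*, \theta) = 0\) gives \(dx^*/d\theta = -(\nabla_x^2 f)^{-1}\, \nabla_{x\theta}^2 f\). A minimal sketch on a quadratic (a hypothetical example where the answer is known in closed form):

```python
import numpy as np

# f(x, theta) = 0.5 x^T A x - theta^T x, so x*(theta) = A^{-1} theta
# and the Jacobian dx*/dtheta should equal A^{-1}.
A = np.array([[3.0, 1.0], [1.0, 2.0]])   # positive-definite Hessian
theta = np.array([1.0, -1.0])

x_star = np.linalg.solve(A, theta)       # solve the inner optimisation problem

H = A                                    # d^2 f / dx^2 at the optimum
cross = -np.eye(2)                       # d^2 f / (dx dtheta)
J = -np.linalg.solve(H, cross)          # implicit-function-theorem Jacobian

print(J)                                 # equals A^{-1}
```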
no code implementations • 12 Jun 2023 • Moritz A. Zanger, Wendelin Böhmer, Matthijs T. J. Spaan
In contrast to classical reinforcement learning, distributional reinforcement learning algorithms aim to learn the distribution of returns rather than their expected value.
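One widely used building block in this family is the categorical projection of C51-style algorithms, which maps the Bellman-shifted return distribution back onto a fixed support (a generic sketch of that projection, not necessarily the paper's method):

```python
import numpy as np

def project(probs, atoms, r, gamma):
    """Project the shifted distribution of r + gamma*Z onto the fixed
    support `atoms` (the categorical projection used in C51-style
    distributional RL)."""
    v_min, v_max = atoms[0], atoms[-1]
    dz = atoms[1] - atoms[0]
    out = np.zeros_like(probs)
    for p, z in zip(probs, atoms):
        tz = np.clip(r + gamma * z, v_min, v_max)  # Bellman-shifted atom
        b = (tz - v_min) / dz                      # fractional index on support
        l, u = int(np.floor(b)), int(np.ceil(b))
        if l == u:
            out[l] += p
        else:                                      # split mass between neighbours
            out[l] += p * (u - b)
            out[u] += p * (b - l)
    return out

atoms = np.linspace(-1.0, 1.0, 5)                  # support: -1, -0.5, 0, 0.5, 1
probs = np.full(5, 0.2)                            # uniform return distribution
new = project(probs, atoms, r=0.1, gamma=0.9)
print(new.sum())                                   # still a valid distribution
```

When no atoms are clipped, the linear interpolation preserves the mean, so the expected value matches the classical Bellman target while the full distribution is retained.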
no code implementations • 9 Jun 2023 • Max Weltevrede, Matthijs T. J. Spaan, Wendelin Böhmer
We motivate mathematically and show empirically that generalisation to tasks that are "reachable" during training is improved by increasing the diversity of transitions in the replay buffer.
no code implementations • 6 Dec 2022 • Álvaro Serra-Gómez, Eduardo Montijano, Wendelin Böhmer, Javier Alonso-Mora
In this paper, we consider the problem where a drone has to collect semantic information to classify multiple moving targets.
no code implementations • 21 Oct 2022 • Yaniv Oren, Matthijs T. J. Spaan, Wendelin Böhmer
One of the best-studied and highest-performing planning approaches used in Model-Based Reinforcement Learning (MBRL) is Monte-Carlo Tree Search (MCTS).
no code implementations • 6 Oct 2020 • Tarun Gupta, Anuj Mahajan, Bei Peng, Wendelin Böhmer, Shimon Whiteson
VDN and QMIX are two popular value-based algorithms for cooperative MARL that learn a centralized action value function as a monotonic mixing of per-agent utilities.
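The two factorisations can be contrasted in a few lines (a toy single-state sketch with hypothetical utilities; QMIX's mixing network is reduced here to one layer of non-negative weights):

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-agent utilities Q_i(a_i) for 2 agents with 3 actions each (toy numbers).
q1 = np.array([0.1, 0.5, 0.2])
q2 = np.array([0.3, 0.0, 0.4])

# VDN: Q_tot = Q_1 + Q_2 (additive factorisation).
vdn = q1[:, None] + q2[None, :]

# QMIX-style sketch: non-negative mixing weights guarantee that
# dQ_tot/dQ_i >= 0, i.e. the mixing is monotonic in each utility.
w = np.abs(rng.normal(size=2))          # weights forced non-negative
qtot = w[0] * q1[:, None] + w[1] * q2[None, :]

# Monotonicity means decentralised per-agent argmaxes recover the joint argmax.
print(np.unravel_index(qtot.argmax(), qtot.shape), (q1.argmax(), q2.argmax()))
```

This monotonicity is the property that lets each agent act greedily on its own utility while remaining consistent with the centralised value function.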
2 code implementations • 7 Jun 2020 • Shariq Iqbal, Christian A. Schroeder de Witt, Bei Peng, Wendelin Böhmer, Shimon Whiteson, Fei Sha
Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities; however, common patterns of behavior often emerge among these agents/entities.
3 code implementations • NeurIPS 2021 • Bei Peng, Tabish Rashid, Christian A. Schroeder de Witt, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Böhmer, Shimon Whiteson
We propose FACtored Multi-Agent Centralised policy gradients (FACMAC), a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces.
1 code implementation • ICLR 2020 • Tabish Rashid, Bei Peng, Wendelin Böhmer, Shimon Whiteson
We show that this scheme is provably efficient in the tabular setting and extend it to the deep RL setting.
2 code implementations • ICML 2020 • Wendelin Böhmer, Vitaly Kurin, Shimon Whiteson
This paper introduces the deep coordination graph (DCG) for collaborative multi-agent reinforcement learning.
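A coordination graph factors the joint value into per-agent utilities plus pairwise payoffs along graph edges. The sketch below (a hypothetical 3-agent line graph with made-up payoffs) maximises such a decomposition by brute force; DCG itself parameterises the terms with neural networks and maximises via max-plus message passing:

```python
import numpy as np
from itertools import product

# Q(a) = sum_i f_i(a_i) + sum_{(i,j) in edges} f_ij(a_i, a_j)
# on a line graph 0-1, 1-2 with 2 actions per agent.
f = [np.array([0.0, 0.1]), np.array([0.2, 0.0]), np.array([0.0, 0.3])]
f01 = np.array([[1.0, 0.0], [0.0, 1.0]])   # agents 0,1 prefer matching actions
f12 = np.array([[0.0, 1.0], [1.0, 0.0]])   # agents 1,2 prefer differing actions

def Q(a):
    return (f[0][a[0]] + f[1][a[1]] + f[2][a[2]]
            + f01[a[0], a[1]] + f12[a[1], a[2]])

best = max(product(range(2), repeat=3), key=Q)   # brute-force joint argmax
print(best, Q(best))
```

The pairwise terms let the factorisation represent coordination patterns (match / anti-match) that a purely additive per-agent decomposition such as VDN cannot express.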
no code implementations • 5 Jun 2019 • Wendelin Böhmer, Tabish Rashid, Shimon Whiteson
This paper investigates the use of intrinsic reward to guide exploration in multi-agent reinforcement learning.
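A common form of intrinsic reward (shown here as a generic count-based novelty bonus, not necessarily the paper's scheme) pays a bonus that decays as a state is revisited:

```python
import numpy as np
from collections import Counter

beta = 1.0          # bonus scale (hypothetical hyperparameter)
counts = Counter()  # visit counts N(s)

def intrinsic_reward(state):
    """Count-based novelty bonus beta / sqrt(N(s))."""
    counts[state] += 1
    return beta / np.sqrt(counts[state])

rewards = [intrinsic_reward("s0") for _ in range(4)]
print(rewards)  # decays: 1, 1/sqrt(2), 1/sqrt(3), 1/2
```

Adding such a bonus to the environment reward steers agents toward rarely visited states, which is particularly delicate in the multi-agent case because each agent's novelty depends on the others' behaviour.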
1 code implementation • 1 Apr 2019 • Maximilian Igl, Andrew Gambardella, Jinke He, Nantas Nardelli, N. Siddharth, Wendelin Böhmer, Shimon Whiteson
We present Multitask Soft Option Learning (MSOL), a hierarchical multitask framework based on Planning as Inference.
no code implementations • 22 Dec 2016 • Wendelin Böhmer, Rong Guo, Klaus Obermayer
This paper investigates a type of instability that is linked to the greedy policy improvement in approximated reinforcement learning.
no code implementations • 19 Dec 2014 • Wendelin Böhmer, Klaus Obermayer
Many applications that use empirically estimated functions face a curse of dimensionality, because the integrals over most function classes must be approximated by sampling.
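The contrast behind this curse can be illustrated with Monte Carlo integration (a standard textbook example, not the paper's method): the sampling error scales as \(O(1/\sqrt{n})\) independently of the dimension, whereas a grid of fixed resolution needs exponentially many points in the dimension \(d\).

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 10, 100_000

# Integrate f(x) = sum(x) over the unit cube [0, 1]^d; the exact value is d/2.
x = rng.random((n, d))                 # n uniform samples in the cube
estimate = x.sum(axis=1).mean()        # Monte Carlo estimate of the integral
print(estimate)                        # close to d/2 = 5
```

A grid with only 10 points per axis would already require \(10^{10}\) evaluations in \(d = 10\) dimensions; the Monte Carlo estimate above reaches sub-percent error with \(10^5\) samples.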