Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels
We propose a simple data augmentation technique that can be applied to standard model-free reinforcement learning algorithms, enabling robust learning directly from pixels without auxiliary losses or pre-training. The approach leverages input perturbations commonly used in computer vision to regularize the value function. Existing model-free methods such as Soft Actor-Critic (SAC) cannot train deep networks effectively from image pixels, but adding our augmentation dramatically improves SAC's performance, reaching state-of-the-art results on the DeepMind Control Suite and surpassing model-based methods (Dreamer, PlaNet, and SLAC) as well as the recently proposed contrastive-learning approach CURL. Our method can be combined with any model-free reinforcement learning algorithm with only minor modifications. An implementation is available at https://sites.google.com/view/data-regularized-q.
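To make the idea concrete, here is a minimal sketch of the kind of random-shift augmentation the abstract describes: pad the observation by a few pixels, randomly crop back to the original size, and average value estimates over several augmented views to regularize the Q-function. This is an illustrative sketch using NumPy; the function names (`random_shift`, `augmented_q_target`) and the pad size of 4 are assumptions, not the paper's exact implementation.

```python
import numpy as np

def random_shift(img: np.ndarray, pad: int = 4, rng=None) -> np.ndarray:
    """Pad an image by `pad` pixels on each side (edge replication),
    then randomly crop back to the original size. This is a common
    random-shift augmentation for pixel-based RL (pad size assumed)."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    top = rng.integers(0, 2 * pad + 1)
    left = rng.integers(0, 2 * pad + 1)
    return padded[top:top + h, left:left + w]

def augmented_q_target(q_fn, obs: np.ndarray, k: int = 2) -> float:
    """Regularize a critic estimate by averaging Q-values over `k`
    independently augmented views of the same observation
    (`q_fn` is a stand-in for the learned Q-network)."""
    return float(np.mean([q_fn(random_shift(obs)) for _ in range(k)]))
```

Because each shifted crop should map to the same value, averaging over augmented views reduces the variance of the critic's target without any auxiliary loss, which is the regularization effect the abstract refers to.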
ICLR 2021 · PDF · Abstract · Code
Results from the Paper
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Atari Games 100k | Atari 100k | DrQ | Mean Human-Normalized Score | 0.357 | # 15 |
| Atari Games 100k | Atari 100k | DrQ | Median Human-Normalized Score | 0.268 | # 12 |
| Continuous Control | DeepMind Cheetah Run (Images) | DrQ | Return | 660 | # 3 |
| Continuous Control | DeepMind Cup Catch (Images) | DrQ | Return | 963 | # 1 |
| Continuous Control | DeepMind Walker Walk (Images) | DrQ | Return | 921 | # 1 |