Paper

Alchemy: A structured task distribution for meta-reinforcement learning.

There has been rapidly growing interest in meta-learning as a method for increasing the flexibility and sample efficiency of reinforcement learning. One problem in this area of research, however, has been a scarcity of adequate benchmark tasks. In general, the structure underlying past benchmarks has either been too simple to be inherently interesting, or too ill-defined to support principled analysis. In the present work, we introduce a new benchmark for meta-RL research, which combines structural richness with structural transparency. Alchemy is a 3D video game, implemented in Unity, which involves a latent causal structure that is resampled procedurally from episode to episode, affording structure learning, online inference, hypothesis testing and action sequencing based on abstract domain knowledge. We evaluate a pair of powerful RL agents on Alchemy and present an in-depth analysis of one of these agents. Results clearly indicate a frank and specific failure of meta-learning, providing validation for Alchemy as a challenging benchmark for meta-RL. Concurrent with this report, we are releasing Alchemy as a public resource, together with a suite of analysis tools and sample agent trajectories.

arXiv (Cornell University) · Published 2021-02-04

Authors: Jane X. Wang · Michael C. King · Nicolas Porcel · Zeb Kurth‐Nelson · Tina Zhu · Charlie Deck · Peter Choy · Mary Cassin · Malcolm Reynolds · Hao Song · Gavin Buttimore · David Reichert · Neil C. Rabinowitz · Loïc Matthey · Demis Hassabis · Alexander Lerchner · Matthew Botvinick

Topics

Agents

Relevant entities

People

Demis Hassabis

CEO
