The Grand AI Handbook

Exploration and Representation Learning in RL

A focus on RND and ICM as mechanisms for driving exploration and improving how agents understand their environments.

Chapter 31: Exploration Mechanisms

- Problem Definition and Research Motivation
- Research Directions: Balancing exploration/exploitation
- Classic Exploration Mechanisms: Epsilon-greedy, UCB, Thompson sampling (a minimal epsilon-greedy sketch follows this outline)
- Curiosity and Intrinsic Motivation:
  - Curiosity-driven RL: ICM (Intrinsic Curiosity Module) (see the ICM sketch below)
  - Intrinsic Motivation: RND (Random Network Distillation), novelty-seeking (see the RND sketch below)
- Goal-Oriented Exploration: HER (Hindsight Experience Replay) (see the HER sketch below)
- Memory-Based Exploration: R2D2
- Other Exploration Mechanisms: Novelty search, diversity-driven RL
- Future Study: Generalizable exploration, lifelong learning
- References

Chapter 32: Representation Learning for RL (state embeddings, contrastive learning, bisimulation metrics)
Chapter 33: Self-Supervised RL (unsupervised skill discovery, DIAYN, APS)
Chapter 34: Robust RL (adversarial training, domain randomization, robust MDPs)
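
As a reference point for the classic mechanisms listed in the Chapter 31 outline, here is a minimal epsilon-greedy sketch: with probability epsilon the agent picks a uniformly random action, otherwise the action with the highest current value estimate. The Q-values and epsilon below are illustrative placeholders, not values from any particular chapter.

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """With probability epsilon explore (random action); otherwise exploit
    (greedy action under the current value estimates)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# Example: action 2 is greedy here and is returned unless the epsilon branch fires.
action = epsilon_greedy([0.1, 0.5, 0.9], epsilon=0.1)
```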
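The ICM entry can be summarized in code. The sketch below, loosely following Pathak et al. (2017), keeps only the core idea: encode observations into features, predict the next feature vector from the current features and action, and use the forward-model prediction error as an intrinsic reward. An inverse model (action prediction) shapes the encoder toward features the agent can actually influence. All dimensions, layer sizes, and names here are hypothetical placeholders, not the chapter's reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICM(nn.Module):
    """Minimal ICM sketch: curiosity = forward-model error in feature space."""

    def __init__(self, obs_dim=8, n_actions=4, feat_dim=32):
        super().__init__()
        self.n_actions = n_actions
        # Feature encoder phi(s).
        self.encoder = nn.Sequential(nn.Linear(obs_dim, feat_dim), nn.ReLU())
        # Forward model: (phi(s), a) -> predicted phi(s').
        self.forward_model = nn.Linear(feat_dim + n_actions, feat_dim)
        # Inverse model: (phi(s), phi(s')) -> predicted a; training it pushes
        # the encoder to keep only action-relevant features.
        self.inverse_model = nn.Linear(2 * feat_dim, n_actions)

    def intrinsic_reward(self, obs, action, next_obs):
        phi, phi_next = self.encoder(obs), self.encoder(next_obs)
        a = F.one_hot(action, self.n_actions).float()
        phi_pred = self.forward_model(torch.cat([phi, a], dim=-1))
        # Curiosity bonus: how badly the forward model predicted phi(s').
        return 0.5 * (phi_pred - phi_next).pow(2).sum(dim=-1).detach()

# Usage with dummy batch data (shapes are placeholders):
icm = ICM()
obs, nxt = torch.randn(5, 8), torch.randn(5, 8)
act = torch.randint(0, 4, (5,))
bonus = icm.intrinsic_reward(obs, act, nxt)  # shape (5,), added to env reward
```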
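RND, the other technique this section highlights, replaces a learned dynamics model with a simpler trick: a fixed, randomly initialized target network and a trained predictor network. The predictor's error on a state serves as the novelty bonus, since it shrinks on states visited often. The sketch below, loosely following Burda et al. (2018), is a minimal version under assumed dimensions; the network sizes and names are placeholders.

```python
import torch
import torch.nn as nn

class RND(nn.Module):
    """Minimal Random Network Distillation sketch."""

    def __init__(self, obs_dim=8, out_dim=32):
        super().__init__()
        def mlp():
            return nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, out_dim))
        self.target = mlp()      # frozen random network, never trained
        self.predictor = mlp()   # trained to imitate the target
        for p in self.target.parameters():
            p.requires_grad_(False)

    def forward(self, obs):
        # Prediction error doubles as novelty bonus and training loss:
        # frequently visited states get fit well, so their bonus decays.
        return (self.predictor(obs) - self.target(obs)).pow(2).mean(dim=-1)

rnd = RND()
obs = torch.randn(5, 8)    # placeholder observations
bonus = rnd(obs).detach()  # intrinsic reward per state
loss = rnd(obs).mean()     # minimized w.r.t. the predictor's parameters
```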
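HER, listed under goal-oriented exploration, can also be captured in a few lines: failed episodes are recycled by relabeling transitions with goals the agent actually reached later in the same episode (the "future" strategy from Andrychowicz et al., 2017). The transition layout, key names, and reward_fn below are hypothetical, not a fixed API.

```python
import random

def her_relabel(episode, reward_fn, k=4):
    """Relabel transitions with goals achieved later in the same episode.
    `episode` is a list of dicts with keys: obs, action, achieved_goal, goal,
    reward, where achieved_goal is the goal actually reached after the step."""
    relabeled = []
    for t, tr in enumerate(episode):
        relabeled.append(tr)  # keep the original transition
        # Sample k future achieved goals and pretend each was the target,
        # recomputing the reward under the substituted goal.
        for _ in range(k):
            future = episode[random.randint(t, len(episode) - 1)]
            new_goal = future["achieved_goal"]
            relabeled.append(dict(tr, goal=new_goal,
                                  reward=reward_fn(tr["achieved_goal"], new_goal)))
    return relabeled
```

A typical sparse reward_fn would return 0 when the achieved goal matches the substituted goal within tolerance and -1 otherwise, which is what makes the relabeled "successes" informative.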