The Grand AI Handbook

Classical RL Algorithms

A survey of foundational RL techniques, from model-based planning and value-based methods to policy gradients and exploration strategies such as epsilon-greedy.

Chapter 8: Model-Based RL (Known models, learned models, Dyna architecture)
Chapter 9: Value-Based Methods (Q-learning, SARSA, expected SARSA, n-step TD)
Chapter 10: Policy Gradient Methods (REINFORCE, actor-critic, baseline subtraction)
Chapter 11: Exploration vs. Exploitation (Epsilon-greedy, UCB, Thompson sampling, optimism in the face of uncertainty)