3 The Reinforcement Learning Problem
3.1 The Agent-Environment Interface
3.2 Goals and Rewards
3.3 Returns
3.4 A Unified Notation for Episodic and Continual Tasks
3.5 The Markov Property
3.6 Markov Decision Processes
3.7 Value Functions
3.8 Optimal Value Functions
3.9 Optimality and Approximation
3.10 Summary
3.11 Bibliographical and Historical Remarks
Richard Sutton
Sat May 31 13:56:52 EDT 1997