I. The Problem
Subsections
1. Introduction
1.1 Reinforcement Learning
1.2 Examples
1.3 Elements of Reinforcement Learning
1.4 An Extended Example: Tic-Tac-Toe
1.5 Summary
1.6 History of Reinforcement Learning
1.7 Bibliographical Remarks
2. Evaluative Feedback
2.1 An n-Armed Bandit Problem
2.2 Action-Value Methods
2.3 Softmax Action Selection
2.4 Evaluation Versus Instruction
2.5 Incremental Implementation
2.6 Tracking a Nonstationary Problem
2.7 Optimistic Initial Values
2.8 Reinforcement Comparison
2.9 Pursuit Methods
2.10 Associative Search
2.11 Conclusions
2.12 Bibliographical and Historical Remarks
    Remarks on Sections 2.1, 2.2, 2.3, 2.4, 2.5-6, 2.8, 2.9, 2.10, 2.11
3. The Reinforcement Learning Problem
3.1 The Agent-Environment Interface
3.2 Goals and Rewards
3.3 Returns
3.4 Unified Notation for Episodic and Continuing Tasks
3.5 The Markov Property
3.6 Markov Decision Processes
3.7 Value Functions
3.8 Optimal Value Functions
3.9 Optimality and Approximation
3.10 Summary
3.11 Bibliographical and Historical Remarks
    Remarks on Sections 3.1, 3.3-4, 3.5, 3.6, 3.7-8
Mark Lee 2005-01-04