Code for:
Below are links to a variety of software related to examples and
exercises in the book.
And below is some of the code that Rich used to generate the examples
and figures in the 2nd edition (made available as is):
- Chapter 1: Introduction
- Chapter 2: Multi-armed Bandits
- Chapter 3: Finite Markov Decision Processes
- Chapter 4: Dynamic Programming
- Chapter 5: Monte Carlo Methods
- Chapter 6: Temporal-Difference Learning
- Chapter 7: n-step Bootstrapping
  - n-step TD on the Random Walk, Example 7.1, Figure 7.2: online and offline (Lisp); C version
- Chapter 8: Planning and Learning with Tabular Methods
- Chapter 9: On-policy Prediction with Approximation
- Chapter 10: On-policy Control with Approximation
  - R-learning on Access-Control Queuing Task, Example 6.7, Figure 6.17 (Lisp), (C version)
- Chapter 11: Off-policy Methods with Approximation
- Chapter 12: Eligibility Traces
- Chapter 13: Policy Gradient Methods