by Richard S. Sutton and Andrew G. Barto

- The title of the final section, Section 17.6, was mistakenly printed as a repeat of an earlier section's title. It should be "The Future of Artificial Intelligence."
- Somehow "function approximation" was mistakenly abbreviated to "function approx." many times in the printed book. We see this on p205 three times, p208, p236, p257, p260, p284, p382 line 10, p426 6 lines from bottom...

- p11, 5 lines from bottom: "(see (Section 16.1))" --> "(Section 16.1)"
- p19, 8 lines from bottom: "(Section 16.2)" --> "(Section 15.9)"
- p30, Exercise 2.2: The values specified for R_1 and R_3 should
have minus signs in front of them

- p64, after the figure: v_pi --> v_*

- p98, start of last paragraph: For Monte Carlo policy evaluation --> For Monte Carlo policy iteration
- p107, the 3rd line is cut off; it should read: "using the behavior policy that selects right and left with equal probability."

- p117, in 5.5: probabalistic --> probabilistic
- p153, end of first paragraph: occuring --> occurring
- p220, middle of page: horizonal --> horizontal
- p229, bottom of page: forgeting --> forgetting
- p248, Exercise 10.1, line 2: "or in" --> "in"

- p302, middle: auxilary --> auxiliary
- p321, after equation (13.1): d should be d'

- p337, bottom: Schall --> Schaal
- p451, middle left: user-targed --> user-targeted
- p460, in paragraph 3, then again in paragraph 4: auxilary --> auxiliary
- p465, middle right: there should be no commas in the list defining tau
- p470, line 10: "function approx." --> "function approximation"

- p510, in Sorg (2011): "Problem:Designing" --> "Problem: Designing"
- p180, line 24: "sill" --> "still" (James R)

- p212, line 1: "length the interval" --> "length of the interval" (Prabhat Nagarajan)
- p212, second to last line: The i index should start at 1, not 0. (Chris Harding)
- p229, In (9.22) and the equation above it labeled (from (9.20)), all the x's should have their time index reduced by 1: x_t --> x_{t-1} and x_{t+1} --> x_t (Frederic Godin)
- p244, line 14: w_t --> w_{t-1} (Frederic Godin)
- p327, 5 lines from bottom: "boxed" --> "boxed algorithm" (Douglas De Rizzo Meneghetti)

- p350, third line: "Rescoral-Wgner" --> "Rescorla-Wagner" (Kyle Simpson)
- p354, 11 lines from the bottom: "Rescoral-Wagner" --> "Rescorla-Wagner" (Kyle Simpson)
- p400: The left side of (15.3) is missing a logarithm (ln) between
the grad symbol (Nabla) and the policy symbol (pi) (Jiahao Fan)

- p436, eight lines from the bottom: "Tesauro and colleages" --> "Tesauro and colleagues" (Raymund Chua)
- p447, 14 lines from the bottom: "Figure 16.7, were $\theta$ is" --> "Figure 16.7, where $\theta$ is" (Kyle Simpson)
- p504: "Pavlov, P. I." --> "Pavlov, I. P." (Brian Christian)
- p198, 2/3 down page: "s \mapsto g" --> "s \mapsto u"
- p200: Second paragraph should begin not with "But", but with "It"
- p203, 5 lines from bottom: "but curving slightly" --> "curving very slightly"
- p204, bottom: "approximate state-value function" --> "approximate the state-value function"
- p241: "approximated by linear combination" --> "approximated by a linear combination" (Prabhat Nagarajan)
- p256 in 10.3: "Tsitiklis" --> "Tsitsiklis" (Prabhat Nagarajan)
- p286 in 11.7: "Mahadeval" --> "Mahadevan" (Prabhat Nagarajan)
- p259, above (11.6): "Expected Sarsa" --> "Sarsa" (Xiang Gu)

- Thermal soaring (Section 16.8) has since been extended to real gliders. See Reddy, G., Ng, J. W., Celani, A., Sejnowski, T. J., Vergassola, M. (2018). Soaring like a bird via reinforcement learning in the field. Nature 562:236-239.
- In Figure 11.3, on page 267, there is a point essentially labelled "(MS)TDE=0", suggesting that there is always a value function in the representable subspace at which the mean square temporal-difference error is zero. But in fact there is not. It would be better if this point was labelled "min (MS)TDE".