date day topic Assignment due
10-Jan Tuesday Intro, logistics, requirements, expectations
12-Jan Thursday Introduction Read all of Chapter 1; 2 thought questions
17-Jan Tuesday Bandit Methods Read the nonstarred sections of Chapter 2 plus 2.8; 2 thought questions
19-Jan Thursday Bandit Methods Exercises 2.1, 2.5, and 2.55; 2.8 is extra credit
24-Jan Tuesday Markov Decision Processes Read all of Chapter 3; 2 thought questions
26-Jan Thursday Value Functions Exercises 3.4, 3.5, (3.6 is extra credit), 3.8 (omit final part re eq 3.10), 3.9, 3.10, 3.11, 3.15, 3.17
31-Jan Tuesday RL-Glue, RL-Library Implement Party Problem MDP; generate 50 episodes with the random policy and compute the average return at start state; compute state values
2-Feb Thursday Dynamic Programming Read Chapter 4; 2 thought questions
7-Feb Tuesday Dynamic Programming Exercises 4.1, 4.2, 4.3, 4.5, 4.9; Implement policy iteration on the Party Problem; show sequence of policies and value fns, starting with the policy that always parties
9-Feb Thursday Monte Carlo Methods Read Chapter 5; 2 thought questions
14-Feb Tuesday Monte Carlo Control Exercises 5.1, 5.2, 5.5
16-Feb Thursday Temporal Difference Learning Read Chapter 6; 2 thought questions; apply MC ES to the blackjack environment using RL-Glue; plot policies for the 'twice/half as many 10s' cases
28-Feb Tuesday Temporal Difference Learning Exercises 6.1, 6.2, 6.3, 6.8, 6.9, 6.10, 6.12
2-Mar Thursday Special lecture; the exam questions Apply Sarsa control, e-greedy with epsilon=0.1, to the cat-and-mouse problem [cancelled]
7-Mar Tuesday Midterm Exam
9-Mar Thursday Integrating Monte Carlo and Temporal-difference Methods Read Chapter 7; 2 thought questions
14-Mar Tuesday Eligibility Traces Exercises 7.2 and 7.6
16-Mar Thursday Function Approximation Read Chapter 8; 2 thought questions
21-Mar Tuesday Function Approximation Exercises 8.1, 8.2, 8.6 and 8.7; First function approx programming assignment
23-Mar Thursday Policy Gradient Methods with Function Approximation Reading: LSTD(lambda) by Boyan, 2 thought questions
28-Mar Tuesday Integrating Learning and Planning: Dyna Read Chapter 9; 2 thought questions; 2nd function approx programming assignment
30-Mar Thursday Model-based backups Exercises 9.1, 9.2, 9.3, 9.5 (9.6 is extra credit); Read Chapter 10
4-Apr Tuesday Guest lecture by Michael Bowling Read Bowling and Veloso paper, 2 thought questions; 1-page mini-project proposal due
6-Apr Thursday Advanced topic: Temporal abstraction and hierarchy Read the options paper sections 1-3 and 7-8, 2 thought questions
11-Apr Tuesday Advanced topic: Hidden State: POMDPs, PSRs, TD nets Read Chapter 11, 2 thought questions
13-Apr Thursday mini-projects mini-project due
19-Apr Wednesday final exam MEC 4 3