Reinforcement Learning and Artificial Intelligence (RLAI)

Description of CMPUT 607: Reinforcement Learning in Artificial Intelligence


This page provides basic background information about CMPUT 607, a course offered in the fall term of 2007 at the University of Alberta.

Instructors: Csaba Szepesvári (szepesva at cs ..) (http://www.cs.ualberta.ca/~szepesva),
             Mohammad Ghavamzadeh (mgh at cs ..) (http://www.cs.ualberta.ca/~mgh/)

Office hours: Tuesdays after class, 15:30-16:30, Csaba (Ath 3-11);
              Thursdays after class, 15:30-16:30, Mohammad (CSC 3-55).

Class Times: Tuesday and Thursday, 14:00-15:20   Class Room: DP 4069

Description:
This course will provide a comprehensive introduction to reinforcement learning as an approach to artificial intelligence, emphasizing the design of complete agents interacting with stochastic, incompletely known environments. Reinforcement learning has adapted key ideas from machine learning, operations research, psychology, and neuroscience to produce some strikingly successful engineering applications. The focus is on algorithms for learning what actions to take, and when to take them, so as to optimize long-term performance. This may involve sacrificing immediate reward to obtain greater reward in the long term, or just to obtain more information about the environment. The course will cover Markov decision processes, dynamic programming, temporal-difference learning, Monte Carlo reinforcement learning methods, eligibility traces, the role of function approximation, and the integration of learning and planning. The course will emphasize the development of intuition relating the mathematical theory of reinforcement learning to the design of human-level artificial intelligence.
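
As a rough illustration of the trade-off between immediate and long-term reward mentioned above, the following minimal sketch compares two hypothetical reward sequences by their discounted return. The discount factor `gamma`, the function name `discounted_return`, and the reward values are assumptions made for illustration only; they are not taken from the course material.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, each weighted by gamma raised to its time step."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total

# Hypothetical reward sequences for two behaviours (assumed values).
greedy = [1.0, 0.0, 0.0, 0.0]    # take a small reward immediately
patient = [0.0, 0.0, 0.0, 5.0]   # give up early reward for a larger one later

greedy_return = discounted_return(greedy)
patient_return = discounted_return(patient)

print("greedy:", greedy_return)
print("patient:", patient_return)
```

With these assumed numbers, the patient behaviour has the larger discounted return even though it earns nothing in the first three steps; a smaller `gamma` would shift the balance back toward immediate reward.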

Textbook:
Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto. Although a version of the textbook is available online (html, pdf available from within UofA only), students are strongly encouraged to get a physical copy. Many of the readings and questions will come directly from the book. The textbook is available in the bookstore.

NEW:
In addition to the textbook material, we will cover some further topics. The provisional plan is as follows:

- Abstraction with hierarchies: hierarchical reinforcement learning
- Least-squares methods in reinforcement learning
- Policy gradient and Actor-Critic algorithms
- Linear programming

Prerequisites:
Interest in learning approaches to artificial intelligence; basic probability theory; computer programming ability. You should be comfortable with statistical ideas such as probability distributions and expected values. Familiarity with linear algebra would be helpful but is not required.

Written Exercises:
There will be a small set of exercises for most chapters. These will be due at the beginning of the day one week after the chapter is covered in class. All exercises will be marked and returned to you. Answer sheets for each week's exercises will be made available in class on the day the exercises are due, so your exercises must be turned in on time.

Programming Exercises: There will be a few small programming exercises.  These will not be extensive programming projects, but will provide hands-on experience with the abilities and limitations of the algorithms.

Additional Reading:
Students will do additional readings and answer questions related to these in written exercises.

Rules of the game:
We follow the department policies, described here. The permitted collaboration model for homework is consultation; the absence policy is here.

Grading will be on the basis of:

5%     Class participation
25%    Written exercises
25%    Programming exercises
20%    Midterm
25%    Final
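
To make the arithmetic of the weights listed above concrete, here is a minimal sketch of how a final percentage could be computed. It assumes each component is scored on a 0-100 scale and that the components are combined linearly; the page does not state either detail. The names `WEIGHTS` and `course_grade` and the example scores are hypothetical.

```python
# Grading weights as listed above (percent of the final grade).
WEIGHTS = {
    "participation": 5,
    "written": 25,
    "programming": 25,
    "midterm": 20,
    "final": 25,
}

def course_grade(scores):
    """Weighted average of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100.0

# Hypothetical component scores, for illustration only.
example_scores = {
    "participation": 100,
    "written": 80,
    "programming": 90,
    "midterm": 70,
    "final": 60,
}

example_grade = course_grade(example_scores)
print(example_grade)
```

The weights sum to 100, so a student scoring the same value on every component receives that value as the overall percentage.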

