Reinforcement Learning and Artificial Intelligence (RLAI)
Description of CMPUT 607: Reinforcement Learning in Artificial Intelligence
This page provides basic background information about CMPUT 607, a course offered in the fall term of 2007 at the University of Alberta.
Instructors:
Csaba Szepesvári (szepesva at cs ..) (http://www.cs.ualberta.ca/~szepesva)
Mohammad Ghavamzadeh (mgh at cs ..) (http://www.cs.ualberta.ca/~mgh/)
Office hours:
Tuesdays after class, 15:30-16:30: Csaba (Ath 3-11)
Thursdays after class, 15:30-16:30: Mohammad (CSC 3-55)
Class times: Tuesday and Thursday, 14:00-15:20
Classroom: DP 4069
Description: This course will provide a comprehensive introduction to reinforcement learning as an approach to artificial intelligence, emphasizing the design of complete agents interacting with stochastic, incompletely known environments. Reinforcement learning has adapted key ideas from machine learning, operations research, psychology, and neuroscience to produce some strikingly successful engineering applications. The focus is on algorithms for learning what actions to take, and when to take them, so as to optimize long-term performance. This may involve sacrificing immediate reward to obtain greater reward in the long term, or just to obtain more information about the environment. The course will cover Markov decision processes, dynamic programming, temporal-difference learning, Monte Carlo reinforcement learning methods, eligibility traces, the role of function approximation, and the integration of learning and planning. The course will emphasize the development of intuition relating the mathematical theory of reinforcement learning to the design of human-level artificial intelligence.
Textbook: Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto. Although a version of the textbook is available online (html, pdf available from within UofA only), students are strongly encouraged to get a physical copy. Many of the readings and questions will come directly from the book. The textbook is available in the bookstore.
NEW: Beyond the textbook material, we will cover some additional topics. The provisional plan is as follows:
- Abstractions with hierarchies: hierarchical reinforcement learning
- Least-squares methods in reinforcement learning
- Policy gradient and Actor-Critic algorithms
- Linear programming
Prerequisites: Interest in learning approaches to artificial intelligence; basic probability theory; computer programming ability. You should be comfortable with statistical ideas such as probability distributions and expected values. Familiarity with linear algebra would be helpful but is not required.
Written Exercises: There will be a small set of exercises for most chapters. These will be due at the beginning of the day one week after the chapter is covered in class. All exercises will be marked and returned to you. Answer sheets for each week's exercises will be made available in class on the day the exercises are due, so your exercises must be turned in on time.
Programming Exercises: There will be a few small programming exercises. These will not be extensive programming projects, but will provide hands-on experience with the abilities and limitations of the algorithms.
Additional Reading: Students will do additional readings and answer questions about them in the written exercises.
Rules of the game: We follow the department policies, described here. The permitted collaboration model for homework is consultation; the absence policy is here.
Grading will be on the basis of:
5% Class participation
25% Written exercises
25% Programming exercises
20% Midterm
25% Final