NIPS Workshop on Reinforcement Learning: Benchmarks and Bake-offs
The workshop on Reinforcement
Learning: Benchmarks and Bake-offs
will be held on December 17 as part of the NIPS conference.
The workshop will explore the establishment of a standard set of
benchmarks and a series of competitive events (bake-offs) to enhance
reinforcement-learning research. The workshop will ideally produce
the following outputs:

1) a proposed specification for implementing benchmark problems,
2) identification of a list of initial benchmarks, with assignment of
responsibility for their implementation,
3) policies for extending the benchmark set to address new issues,
4) specific proposals for a series of competitive events comparing
different reinforcement-learning methods on various kinds of problems,
and
5) the formation of a policy committee to guide the construction and
evolution of the benchmarks and competitions.

For the purposes of this workshop, reinforcement learning (RL) is
meant to include a broad range of interactive learning problems,
including POMDPs, navigation, control problems, probabilistic
planning, and sequential prediction problems with and without actions.
It has often been suggested that the field of reinforcement learning
would benefit from the establishment of standard benchmark problems
and regular competitive events. Isolated efforts have
faltered due to a lack of "buy-in" from the community.
Competitions
can greatly increase the interest and focus in an area by clarifying
its objectives and challenges, publicly acknowledging the best
algorithms, and generally making the area more exciting and enjoyable.
Standard benchmarks can make it much easier to apply new algorithms to
existing problems and thus provide clear first steps toward their
evaluation. Competitions and benchmark problems that can be used in
these ways have yet to be established, for a variety of reasons.
The purpose of this workshop is to explore innovative approaches
to overcoming these issues (leading to outputs 1-5 above).
Examples
of successful prior standards efforts include the UCI database,
RoboCup, the International Planning Competitions, the AAAI Robot
Challenge, and the Trading Agents Competition. We will endeavor to
attract experts in each of these competitions to report on their
successes and failures, strengths and weaknesses, lessons learned,
etc.
The workshop will open with an outline of its goals and scope from
the organizers, followed by reports from representatives of existing
competitions and benchmark sets, and conclude with working sessions
to produce workshop outputs 1-5. Slightly more than half the time
will be reserved for general discussion and working sessions.
Organizers:
Richard S. Sutton, University of Alberta, Alberta, Canada
Michael L. Littman, Rutgers University, New Jersey, USA

Participants might include:
Peter Stone (RoboCup)
Sven Koenig (IPC)
Michael Littman (IPC, probabilistic track)
Alan Schultz (AAAI Robotics Challenge)
David Aha (UCI database)
Michael Wellman (TAC)
The workshop will last for one day, but we hope that it will be on
the first day so that more detailed discussion and work can continue
on the day after.
I suggest that the Workshop page have a suggestions section so that those of us who are unlikely to attend can contribute ideas and so that everyone can consider the issues prior to the Workshop.
The potential problem of benchmarks discouraging new problem settings might be partly avoided by having a "language" for specifying problems. The chosen benchmarks at any time would be specific instances drawn from the universe of problems that the language can generate. This would allow researchers to create scaled and related problems for their own exploratory purposes, and would allow new benchmarks to be created more easily than if each were a one-off exercise.
I would like to see families of benchmarks that are related but vary along a number of dimensions (e.g. complexity, noisiness, degree of look-ahead required). A single benchmark does not, of itself, assist with decomposing the causal factors of performance. The concepts from Paul Cohen's "Empirical Methods for Artificial Intelligence" would be relevant here.
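To make these two suggestions concrete, here is a minimal sketch in Python of what a specification language and a benchmark family might look like. All names here (GridworldSpec, its fields, the generate function) are hypothetical illustrations, not a proposal for the actual specification:

    import random
    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class GridworldSpec:
        """One point in the universe of problems the language can express."""
        size: int         # side length of the grid (complexity)
        slip_prob: float  # chance an action is replaced by a random one (noisiness)
        horizon: int      # episode length (bounds the look-ahead required)
        seed: int         # fixes goal and obstacle placement, for reproducibility

    def generate(spec):
        """Instantiate a concrete problem from its specification. A published
        benchmark would be a frozen spec distributed alongside this generator."""
        rng = random.Random(spec.seed)
        cells = [(x, y) for x in range(spec.size) for y in range(spec.size)]
        goal = rng.choice(cells)
        free = [c for c in cells if c not in ((0, 0), goal)]
        obstacles = set(rng.sample(free, k=spec.size))
        return (0, 0), goal, obstacles

    # A family of related benchmarks: hold the spec fixed and sweep one dimension.
    base = GridworldSpec(size=10, slip_prob=0.1, horizon=50, seed=42)
    noise_family = [replace(base, slip_prob=p) for p in (0.0, 0.1, 0.2, 0.4)]
    scale_family = [replace(base, size=n) for n in (5, 10, 20, 40)]

    for spec in noise_family:
        start, goal, obstacles = generate(spec)
        print(spec.slip_prob, goal, len(obstacles))

Because a spec is data rather than code, the official benchmarks could be published as frozen specs while researchers sweep the same dimensions for their own exploration, which speaks to both points above.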
I am interested in problems that have recursive structure (i.e. where attainment of subgoals is important to performance), so I would like the problem generation language to allow for this.
I am interested in problems that have relational structure (e.g. the alignment of objects in a 2-D world indicating the direction to a goal), so I would like the problem generation language to allow for this.
Many problems in the literature have a spatial aspect (i.e. 2-D or 3-D worlds for robotics). I would be uncomfortable if the problem generation language were necessarily spatial. I would prefer to see a language that could generate spatial problems as special cases; one possible shape for such a language is sketched below.
Ross Gayler
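A rough sketch, in Python with entirely hypothetical names and representations, of a problem language meeting the three requirements above: goals that nest recursively (subgoal structure), leaf tests that express relations between objects rather than coordinates, and spatial worlds arising only as one instantiation of an otherwise abstract state space:

    from dataclasses import dataclass
    from typing import Callable, Dict, List, Optional

    State = str                          # states are opaque labels, not coordinates
    Problem = Dict[State, List[State]]   # state -> reachable successor states

    @dataclass
    class Goal:
        """Recursive goal structure: either a leaf test on states, or an
        ordered sequence of subgoals whose attainment matters to success."""
        test: Optional[Callable[[State], bool]] = None
        subgoals: Optional[List["Goal"]] = None

    def first_achieved(traj, goal, start=0):
        """Index just past the point where goal is first achieved along traj
        (searching from position start), or None if it never is."""
        if goal.test is not None:
            for i in range(start, len(traj)):
                if goal.test(traj[i]):
                    return i + 1
            return None
        pos = start
        for sub in goal.subgoals:        # subgoals must be achieved in order
            pos = first_achieved(traj, sub, pos)
            if pos is None:
                return None
        return pos

    def grid_problem(n):
        """A spatial 2-D world as just one instantiation of the language:
        states happen to be cells, successors happen to be neighbours."""
        prob = {}
        for x in range(n):
            for y in range(n):
                nbrs = [(x + dx, y + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
                prob[f"{x},{y}"] = [f"{a},{b}" for a, b in nbrs if 0 <= a < n and 0 <= b < n]
        return prob

    # A relational leaf test: "aligned with the goal column" relates two
    # objects rather than naming a coordinate outright.
    def same_column_as_goal(s):
        return s.split(",")[0] == "3"

    goal = Goal(subgoals=[
        Goal(test=lambda s: s == "2,0"),   # subgoal: reach a waypoint first
        Goal(test=same_column_as_goal),    # then align with the goal column
        Goal(test=lambda s: s == "3,3"),   # then the goal itself
    ])

    world = grid_problem(4)
    traj = ["0,0", "1,0", "2,0", "3,0", "3,1", "3,2", "3,3"]
    assert all(b in world[a] for a, b in zip(traj, traj[1:]))  # a legal path
    print(first_achieved(traj, goal) is not None)              # True

The separation is the point of the sketch: Goal and first_achieved never mention space, so the same machinery covers non-spatial problems unchanged.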