Reinforcement Learning and Artificial Intelligence (RLAI)
Reinforcement Learning Toolkit
This web page provides source, documentation, and updates for the
Reinforcement Learning (RL) toolkit. The toolkit is a collection of
utilities and demos developed by the RLAI group that may be useful for
anyone trying to learn, teach, or use reinforcement learning. The tools
are suitable for a range of users, from those who have never used RL
before to very experienced users.
The RL toolkit is in the public domain and can be used by anyone for
any purpose.
This software still works but is only lightly maintained. For
updates, send email to rich@richsutton.com.
Converted to Python 3.5 in October 2016. -rss
Toolkit Contents
Downloads
Requirements
Installation Instructions
Usage Instructions
Using the Gridworld Demo
Importing the Whole Toolkit at Once
Importing Specific Tools
Running the Demos
Previous Toolkit Versions and Change Information
Toolkit Contents:
- The Reinforcement Learning Interface (RLI) (module RLinterface) - a standard for interconnecting RL agents and environments. This interface allows you to simply specify an agent function and an environment function and then easily simulate steps or episodes of them interacting.
- Tile Coding (tiles) - tile coding is a very useful way to handle large or continuous state spaces. The software here includes collision handling, as well as wrap versions.
- Eligibility Traces (module traces) - ...
- Utilities (module utilities) - (probability, statistics, etc.)
- Graphical Utilities (quickgraph)
  - g - a low-level graphics package using Python and Tk. This is a very easy to use graphics package which will enable you to quickly make graphical displays of your application. We've used it to make some fine demos.
  - graph - a quick graphing tool based on g
  - graph3d - a 3d drawing package based on g
  Note: g is based on Tk. You must have Tk installed on your machine, and the Tkinter package for Python. You must also be running in a windowing environment to be able to run these and create new windows.
- A variety of agents and environments ...
- Demos:
  - Mountain Car (both graphical and non-graphical versions)
  - Maintenance Example - a simple MDP where the agent tries to maximize a machine's running time.
  - Gridworld (both graphical and non-graphical versions) - the agent must make its way through a gridworld with barriers and walls to the goal. In the GUI version, the squares change color to show learning. The agent may use one of 4 learning methods: one-step Q, Q(lambda) with traces, Sarsa(lambda), and Dyna.
  - Function Approximation - this GUI demo visually shows how function approximation works.
  - Tiles - this GUI demo shows, for one particular x,y point, the various tiles of different shapes that the point falls in. Use the space bar and arrow keys to highlight the various tiles. Clicking in the graph will show the tiles for the new point you clicked.
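To make the idea behind the Tiles demo concrete, here is a minimal sketch of tile coding for a single x,y point, written in plain Python. It is not the toolkit's tiles module: the function name active_tiles, its parameters, and the uniform diagonal offsets are assumptions chosen for illustration, and the sketch leaves out the hashing, collision handling, and wrap versions mentioned above.

```python
import math

def active_tiles(x, y, num_tilings=4, tile_width=1.0):
    """Return one (tiling, column, row) triple per tiling for the point (x, y).

    Hypothetical helper for illustration; not the toolkit's tiles API.
    Each tiling is a grid of square tiles, shifted from the previous
    tiling by tile_width / num_tilings along both axes.
    """
    tiles = []
    for tiling in range(num_tilings):
        offset = tiling * tile_width / num_tilings
        col = math.floor((x + offset) / tile_width)
        row = math.floor((y + offset) / tile_width)
        tiles.append((tiling, col, row))
    return tiles

# Two nearby points: each point activates exactly one tile per tiling.
a = active_tiles(0.30, 0.70)
b = active_tiles(0.60, 0.70)
print(a)
print(b)
# Tiles the two points have in common; shared tiles are what let a
# learner generalize from one point to its neighbors.
print(len(set(a) & set(b)), "of", len(a), "tiles shared")
```

Points that are close together share most of their tiles, and the overlap shrinks as they move apart. That overlap is the property a learner built on tile coding relies on.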
RLToolkit downloads - Version 1.0:
Last update: November 8, 2011
Changes and Previous Versions
Resurrected a version of the toolkit obtained from Anna. -Rich
Requirements:
- Python (version 2.3, also seems to work under 2.7)
- C++ (optional; needed only for the faster version of tiles)
- For the GUI version you must also have the following installed on your machine:
  - Tk
  - Tkinter (the Python interface for Tk)
  - In addition, you must be running in a windowing environment.
- Questions? More information is available here.
- For help in using Python, and especially with Tkinter, see here.
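A quick way to check the GUI requirements from Python itself is sketched below. It assumes Python 3, where the package is named tkinter (lower case) rather than Tkinter. The display check is only a rough guess based on environment variables, and both function names are made up for this example.

```python
import os
import sys

def tkinter_installed():
    """True if this Python can import tkinter (the Python 3 name of Tkinter)."""
    try:
        import tkinter  # noqa: F401  (imported only to see whether it loads)
    except ImportError:
        return False
    return True

def display_probably_available():
    """Rough guess at whether a windowing environment is present.

    Assumption: on Linux and similar systems a GUI session sets DISPLAY
    or WAYLAND_DISPLAY; Windows and macOS are assumed to have a display.
    """
    if sys.platform.startswith("win") or sys.platform == "darwin":
        return True
    return bool(os.environ.get("DISPLAY") or os.environ.get("WAYLAND_DISPLAY"))

print("tkinter installed:", tkinter_installed())
print("windowing environment likely:", display_probably_available())
```

If either line prints False, the nonGUI parts of the toolkit are the ones to try first.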
Installation:
For either the regular or nonGUI toolkit tar file, do one of the following:
- Install the toolkit as a standard Python module:
  - Download the entire module. It unzips/untars into a folder called RLtoolkit.
  - Place the RLtoolkit folder in your Python site-packages folder. For example:
    - Linux Redhat 9: /usr/lib/python2.3/site-packages
    - Mac OS X: /Library/Python/2.7/site-packages
    - Windows XP: c:\Python23\Lib\site-packages
- If you wish to use the C version of tiles, you must also compile the C code located in RLtoolkit.CTiles. There is a Macintosh Makefile available there to help you do this.
- If you have installed the toolkit and it tells you that one of the packages or modules does not exist, check the __init__.py files for the package or subpackage involved. If the file is missing, you can add an empty one by that name, but then imports with "*" may not work as desired.
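If you are unsure where site-packages is for the Python you are running, or which folder lacks its __init__.py, a short script can report both. The sketch below uses only the standard library. The helper names are made up for this example, and it assumes the RLtoolkit folder sits directly under the directory that sysconfig reports, which may differ from the example paths above.

```python
import os
import sysconfig

def site_packages_dir():
    """Directory where this interpreter installs pure-Python packages."""
    return sysconfig.get_paths()["purelib"]

def dirs_missing_init(package_root):
    """List folders under package_root that hold .py files but no __init__.py."""
    missing = []
    for dirpath, dirnames, filenames in os.walk(package_root):
        # Skip bytecode cache folders; they are never packages.
        dirnames[:] = [d for d in dirnames if d != "__pycache__"]
        has_python = any(name.endswith(".py") for name in filenames)
        if has_python and "__init__.py" not in filenames:
            missing.append(dirpath)
    return sorted(missing)

print("site-packages:", site_packages_dir())
toolkit_dir = os.path.join(site_packages_dir(), "RLtoolkit")
if os.path.isdir(toolkit_dir):
    print("folders missing __init__.py:", dirs_missing_init(toolkit_dir))
else:
    print("RLtoolkit folder not found at", toolkit_dir)
```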
Usage
Using the gridworld demo:
After you have downloaded and
decompressed the RLtoolkit.zip file, you should have a folder called
RLtoolkit. You should put this folder where it will be found by the
version of Python that you have running on your machine. On my Mac, it
goes in Macintosh HD>Library>Python>2.7>site-packages. I
figured this out by looking at the older instructions above and below,
and you might do the same if you have a different computer.
Of course, you will need Python to be installed on your machine,
including pythonw for the graphics.
Then, run the demo by the following steps:
1. Start Python by, e.g., typing "pythonw" at a command line, such as in
the Terminal program on a Mac.
2. In Python, at the prompt, load the RLtoolkit by typing "from
RLtoolkit import *"
3. Start the Gridworld Demo by typing "demo.demos("gwg", "run")". A
gridworld window should pop up. (You can also run the other demos
described below similarly.)
4. Click anywhere on the gridworld window to switch to Python (but
maybe not on the grid itself because it may make a barrier). You should
now see the gridworld menus, such as "Gridworld" and "Agent".
5. Select a gridworld from the Gridworld menu and an agent from the
Agent menu. Display and set your parameters from the Agent menu. There
is a little display bug in that a newly created gridworld window does
not show its contents. You can see them by going to the Simulation menu
and selecting "Redisplay".
6. Now you are ready to go. Use the buttons at the bottom of the window
to control the simulation and the display.
If you change any of the code, for example in gwguimain.py, save the
file, close all your gridworld windows, and then type
"reload(demo.gridworld.gwguimain)" followed by "demo.demos("gwg",
"run")" again to restart using the changed code. In Python 3, reload is
not a builtin; if the name is undefined at your prompt, first type
"from importlib import reload".
Importing the Whole Toolkit at Once
- To have all the major tools available for use, with module names:
from RLtoolkit import *
makes the major tools (tiles, traces, g, graph, RLinterface, utilities) as well as demo (which helps run the demos) available. You must put a module name in front to use them. For the tools which have their own folders (and thus are not simple modules), some have a shortcut set up so that you can use a module name which is the lower case version of the tool. Then you can do tiles.loadtiles(...) or g.gDrawCircle(...) as in previous versions. The graph and graph3d routines should be called with graph. and graph3d. (instead of quickgraph.). Note that the "extra" modules in each tool package (e.g. the tiles demo, the g tests) are NOT available with this import.
- To have all the major tools available for use, without module names:
from RLtoolkit.guiuser import *
then you can just call the tiles routines, g routines, etc. by their names. This loads tiles, the tiles demo, traces, g, graph, graph3d, RLinterface, the demo function and utilities.
from RLtoolkit.nonguiuser import *
then you can just call the tiles routines etc. by their names. This loads tiles, traces, RLinterface and utilities.
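In a script that should run on machines with or without the GUI pieces, one option is to try the GUI import first and fall back to the nonGUI one. The sketch below is an assumption about how you might do that, not something the toolkit provides: first_importable is a made-up helper, and it assumes that a missing Tkinter shows up as an ImportError when RLtoolkit.guiuser is loaded.

```python
import importlib

def first_importable(module_names):
    """Import and return the first module in module_names that loads, else None."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None

# Prefer the GUI bundle; fall back to the nonGUI bundle.
toolkit = first_importable(["RLtoolkit.guiuser", "RLtoolkit.nonguiuser"])
if toolkit is None:
    print("RLtoolkit is not installed where this Python can find it")
else:
    print("loaded", toolkit.__name__)
```

Unlike the "import *" forms, this returns a module object, so the routines it loaded are reached through that object (toolkit.name) rather than by bare name.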
Importing Specific Tools
- To load the tiles package, do ONE of the following:
from RLtoolkit.Tiles import *
then you can call the various tile routines by prefixing the module names. For example, tiles.loadtiles(...), tilesdemo.showtiles(...), fancytiles.diamondtiles(...)
import RLtoolkit.Tiles.tiles as tiles
then you can use the basic tile routines by putting tiles. in front (e.g. tiles.tiles, tiles.loadtiles, tiles.tileswrap, tiles.loadtileswrap, tiles.CollisionTable). The tiles demo and "fancy" tiles functions are NOT imported or available with this import.
from RLtoolkit.Tiles.tiles import *
then you can use the basic tile routines without having to prefix a
module name. The demo and "fancy" tiles functions are NOT imported or
available with this import.
from RLtoolkit.tiles import *
does exactly the same as the previous import. It is just a shortcut.
- To load the g package, do ONE of the following:
from RLtoolkit.G import *
then you can call the various g routines by prefixing the module name. For example, g.gDrawCircle(...)
import RLtoolkit.G.g as g
then you can use the basic g routines by putting g. in front (e.g. g.Gwindow, g.gDrawCircle(...), g.gStartEventLoop). The g example and test functions are NOT imported or available with this import.
from RLtoolkit.G.g import *
then you can use the g functions without having to prefix a
module name. The examples and test functions are NOT imported or
available with this import.
from RLtoolkit.g import *
does exactly the same as the previous import. It is just a shortcut.
- To load the quickgraph package, do ONE of the following:
from RLtoolkit.Quickgraph import *
then you can call the various graphing routines by prefixing the module name. For example, graph.graph(...), graph.xTickmarks(...), graph3d.graphSurface(...)
import RLtoolkit.Quickgraph.graph as graph
then you can use the 2d graphing routines by putting graph. in front (e.g. graph.graph(...), graph.xGraphLimits(...)). The graph3d functions are NOT imported or available with this import.
from RLtoolkit.Quickgraph.graph import *
then you can use the graph functions without having to prefix a
module name. The graph3d functions are NOT imported or
available with this import.
from RLtoolkit.graph import *
does exactly the same as the previous import. It is just a shortcut.
import RLtoolkit.Quickgraph.graph3d as graph3d
then you can use the 3d graphing routines by putting graph3d. in front (e.g. graph3d.graphSurface(...)). The 2d graph functions are NOT imported or available with this import.
from RLtoolkit.Quickgraph.graph3d import *
then you can use the graph3d functions without having to prefix a module name. The 2d graph functions are NOT imported or available with this import.
from RLtoolkit.graph3d import *
does exactly the same as the previous import. It is just a shortcut.
- To load any of the single-module tools (RLinterface, traces, utilities), do ONE of the following:
import RLtoolkit.toolname as toolname
then you can use the tool's routines by putting toolname. in front. For example:
import RLtoolkit.RLinterface as rli
rl = rli.RLinterface(...)
from RLtoolkit.toolname import *
then you can call the tool's routines without prefixing the module name. For example:
from RLtoolkit.traces import *
th = TraceHolder(...)
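The arguments to RLinterface are elided above, so the following does not call it. It is a plain-Python sketch of the idea described under Toolkit Contents: an agent function and an environment function that are connected and stepped through an episode. All names, signatures, and the small corridor task are assumptions made for illustration, and the toolkit's actual conventions may differ.

```python
import random

def make_corridor_environment(length=5):
    """Build an environment function for a corridor of `length` cells.

    Hypothetical convention: called with no action, it starts an episode
    and returns the first state. Called with an action (-1 or +1), it
    returns (reward, next_state); next_state is None when the episode ends.
    """
    position = 0

    def environment(action=None):
        nonlocal position
        if action is None:
            position = 0
            return position
        position = max(0, position + action)  # wall at the left end
        if position >= length - 1:            # goal at the right end
            return 0.0, None
        return -1.0, position

    return environment

def random_agent(state, reward=None):
    """Agent function: ignores its inputs and steps left or right at random."""
    return random.choice([-1, 1])

def run_episode(agent, environment, max_steps=1000):
    """Simulate one episode and return (total_reward, steps_taken)."""
    state = environment()
    action = agent(state)
    total, steps = 0.0, 0
    while steps < max_steps:
        reward, state = environment(action)
        total += reward
        steps += 1
        if state is None:
            break
        action = agent(state, reward)
    return total, steps

total, steps = run_episode(random_agent, make_corridor_environment())
print("return:", total, "steps:", steps)
```

Keeping the agent and the environment as two separate functions means either one can be swapped out without touching the loop that connects them.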
Running the Demos
- The following demos are available:
  - Mountain Car
    - GUI version - in RLtoolkit.examples.mountainDemoG
    - Nongui version - in RLtoolkit.examples.mountainDemoN
  - Maintenance Example (running/maintaining machine to maximize reward)
    - Nongui version only - in RLtoolkit.examples.maintenanceDemoN
  - Gridworld
    - GUI version - in RLtoolkit.gridworld.gwDemoG
    - Nongui version - in RLtoolkit.gridworld.gwDemoN
  - Function Approximation
    - GUI version only - in RLtoolkit.fa.demo
  - Tiles demo
    - GUI version only - in RLtoolkit.Tiles.tilesdemo
- To run a specific demo:
  - Using IDLE, open the specific demo file itself (in folder examples, fa, or gridworld within RLtoolkit) and run it (F5). The demos (usually) end in DemoN (for nongraphical) or DemoG (for GUI). They will automatically load whatever tools and other files they need.
  - You can import a demo. GUI demos may start automatically, but if they don't, use the following:
    from RLtoolkit.examples.mountainDemoG import runDemo
    then start the demo with runDemo()
    from RLtoolkit.fa.demo import faDemo
    then start the demo with faDemo()
    from RLtoolkit.gridworld.gwDemoG import *
    then run with runDemo() or runObjDemo()
    from RLtoolkit.Tiles.tilesdemo import showtiles
    then run with showtiles(numtilings, memct, floats, title, start, end, intervals)
  - The nongui demos will print help information on how to run them when they are imported. Note that the functions described by this help information require that you prefix them with a module name. Example imports:
    import RLtoolkit.examples.mountainDemoN as mcdemo
    then call the functions shown with mcdemo. in front (e.g. mcdemo.mcInit()).
    import RLtoolkit.examples.maintenanceDemoN as maint
    then call the functions shown with maint. in front (e.g. maint.maintTest(...)).
    import RLtoolkit.gridworld.gwDemoN as gw
    then call the functions shown with gw. in front (e.g. gw.gwInit(...))
- To use the demos function for easy access to the demos, do ONE of the following:
  - Using IDLE, open the demo.py file from within the RLtoolkit folder, and run it (F5). This is the recommended way to run the demos. Then you can do:
    demos() - prints a list of available demos
    demos('demoname') - prints information about the specific demo
    demos('demoname', 'run') - loads the specific demo. If it is a GUI demo it will start up automatically. If it is not, help information on how to run it will be printed.
  - If you have imported the whole toolkit (from RLtoolkit import *), the demos function is available in module demo.
    from RLtoolkit import *
    demo.demos() - prints a list of available demos
    demo.demos('demoname') - prints information about the specific demo
    demo.demos('demoname', 'run') - loads the specific demo. If it is a GUI demo it will start up automatically. If it is not, help information on how to run it will be printed. You must prefix the commands shown with demo. (e.g. demo.mcInit())
  - Import the demos function from the file demo as follows:
    import RLtoolkit.demo as demo
    - use the demos function as described above (e.g. demo.demos(...))
  - If you have imported the whole toolkit (from RLtoolkit.guiuser import *), the demos function is available with no module prefix. Note that the nongui demos commands may not be available. The GUI ones should work though.
    from RLtoolkit.guiuser import *
    demos() - prints a list of available demos
    demos('demoname') - prints information about the specific demo
    demos('demoname', 'run') - loads the specific demo. If it is a GUI demo it will start up automatically. If it is not, help information on how to run it will be printed.
This open web page is hosted at the University of Alberta.