- Introduction
- GetTiles (the main routine)
- Example: 32 4x4 Rectangular Grid-Tilings
- Shortcuts for 1 or 2 Floating-Point Variables
- Simple (Single-Call) Variations
- Complex (Multiple-Call) Variations
- Getting Consistent Hashing
- Lisp Versions
- Python Versions (alpha test)
- Safe Versions
- Wrap-around Versions
- Source Code

Tile coding is the key distinguishing computational idea behind CMACs. Implementations of complete CMACs are available from the folks at the University New Hampshire here. Their package is more suited to straight, bundled CMAC use. Here we separate tile coding from full CMACs, which allows it to be more flexibly used, e.g., with eligibility traces. The current software is also simplified and streamlined consistent with its use in reinforcement learning applications. For example, our code is only a few pages in length and there is just one routine to call. In its normal use that routine has no memory and requires no special setup or data types.

This version 2.0 of the tiling software provides an alternate version of the main routine that is "safe" in that it corrects for all hashing collisions, so as to exactly simulate infinite tillings (until it runs out of memory). This version can also be used to gather statistics on hashing collisions even if they are be ignored (as in the standard version). The alternate version requires that an additional hashing table be passed in, and there is greater computational expense.

Next we describe the use of the tile coding routine. This description applies to both the C and LISP implementations except that some of the data handling is a little simpler in LISP. Mostly we assume we are talking about C; the simplifications for LISP users are summarized here. The C++ code assumes the existence of a function "rand()" that produces successive psuedorandom integers, of which only the low-order bytes are used.

The tile coding software provides a single routine, "`GetTiles`":

void GetTiles(

int tiles[], // provided array will contain the returned tile indices

int num_tilings, // number of tile indices to be returned

int memory_size, // total number of possible tiles

float floats[], // array of floating-point vars to be converted to tiles

int num_floats, // number of floating-point variables

int ints[], // array of integer variables to be converted to tiles

int num_ints) // number of integer variables

The overall idea here is as follows. The first argument (`tiles`)
is the output of the routine. This array is provided by the caller and
filled up by `GetTiles` with tile indices (integers in [0,`memory_size`)).
The second argument is the number of tile indices returned, so the
provided
array had better be of at least this length. It is recommended by
the folks at UNH (who designed the key algorithm here) that
`num_tilings` be a power of two.

Next, `memory_size` specifies the largest permitted tile
index.
The next two arguments are the primary
inputs to `GetTiles`. They specify the floating-point
variables which are to be tiled, in an array with an accompanying
integer argument to indicate how many of them are in the array.
The routine will overlay grid-tilings on these variables, generalizing
across values that differ by less than 1.0.

The last two arguments are optional. For each different
value of the `num_ints` integers in `ints` the tiles
will be hashed totally differently, such that there
is no generalization to any other value of the integers. These
inputs are often used to ensure different hashings for
different groups of tilings, e.g., for different actions in
reinforcement learning.
There are some illustrative examples below.
Alternatively, if there are three such integers or less, they can be
provided as simple integer arguments to `GetTiles` (in place of
the
last two arguments as described above). These cases will be intercepted
by a specialized version of `GetTiles`, the integers packed
into an
array automatically, and finally the base version of `GetTiles`,
as described above, will be called. There is a similar shortcut for the
case
with one or two floating-point variables, described below.

Next we describe examples, gradually increasing in complexity, of the use of the tiles routine.

float float_array[2];

float_array[0] = x / 0.25;

float_array[1] = y / 0.25;

Despite this fairly
broad generalization, we would like the ability to make fine
distinctions if required.
For this we need many tilings, say 32 (a power of two is good here).
This will give us an ultimate resolution of .25/32=.0078, or better
than 1%.
To arrange for this we pass the tiles routine an integer array, `tiles_array`,
of at least 32 elements, to hold the returned
results, and pass 32 as the
`num_tilings` argument.

int tiles_array[32];

Finally, we have to select `memory_size`, the range
of tile indices that will be hashed down into. First we consider the
"natural" memory size, as if we were doing no hashing. This is simply
the total number of tiles in all the tilings,
or 4*4*32=512 in our case. This is not a large number and might be used
directly as the `memory_size`.
Note, however, that this does not mean there will be no hashing or hash
collisions.
This routine always hashes and so collisions are possible. You could
make them very unlikely
by using a large memory size, say 5120, or even 512000 if you have
enough memory for
how the tiles are to be used. More often the natural memory size is too
large, say 10^10, and we must hash into a smaller memory size.

Thus, our final call to `GetTiles` routine might be

GetTiles(tiles_array,32,512,float_array,2);

The returned tiles can be used in many ways, as is discussed elsewhere. But let
us complete this simple example by showing one use. To do simple linear
function approximation
one uses an array of weights of number equal to the total number of
possible tiles
(`memory_size`).
The approximate function value for the input vector of real values is
obtained
simply by summing the weights of the corresponding returned tiles:

float weights[512];Finally, for learning, assume we have a value

float result=0.0;

for (i=0;i<32;i++) result += weights[tiles_array[i]];

float target;

float alpha=0.1; // step-size parameter

for (i=0;i<32;i++) weight[tiles_array[i]] += alpha/32 * (target - result);

For simple uses of `GetTiles`, packing into arrays is tiresome,
so we provide routines to do the packing for you for the cases
with 1 or 2 floating-point state variables, called `GetTiles1`
and `GetTiles2`.
For example, the 4x4 grid case described above could be done without
explicitly
packing into the `float_array` by calling

GetTiles2(tiles_array,32,512,x/.25,y/.25);

There are a number of useful simple variations on the example described
above.
Obviously, one could scale the two dimensions differently to get
rectangular
rather than square tiles. One could also stretch and distort the
dimensions, for example,
making one logarithmic. Or one could obtain a rotated grid by linear
transformations
on the original *x* and *y* variables. For example, to get
a grid oriented along the diagonals, we construct the `float_array`
like this:

float_array[0] = (x+y) / 0.25;

float_array[1] = (x-y) / 0.25;

Note that the tiling produced in all these cases is not restricted to
any region of
the space. It extends in all directions to infinity; it is a tiling of
the infinite
plane. These infinite number of tiles are reduced to `memory_size`
in number
by the hashing.

If you want to restrict the input vector to a limited region, then
you must do so
yourself before calling `GetTiles`. If you want to achieve a
wrap-around
effect, then you should cause wrap-around when you construct the `float_array`,
e.g., by dividing by the width of the region and filling `float_array`
with the remainder.

Sometimes one can get by with a single kind of tiling, as produced by a
single call
to `GetTiles` (note that a single call creates multiple
tillings, but these are
all simple offset copies of each other). But for many other cases one
needs to be a little more sophisticated.
A great deal can be done to control the generalization and to limit the
memory requirements
by using multiple tilings through multiple calls to `GetTiles`.
We consider some of
these below.

Suppose as previously you have input vectors in the unit square, but
you want to generalize
between input vectors that are similar along the *x* dimension OR
similar along the *y*
dimension. To do this you need *two* kinds of tilings, one
producing stripes horizontally and one vertically. Suppose you want to
make 32 tilings of each type, and combine the results
in one 64-element `tiles_array`:

int tiles_array[64];The second call to

GetTiles1(tiles_array,32,4*32*2,x/0.25,0);

GetTiles1(&tiles_array[32],32,4*32*2,y/0.25,1);

There are a lot of possibilities here. One could use two groups of
tilings, one for each single dimension, as illustrated above, PLUS an
additional group of tilings that depends on both dimensions. This would
create a system that generalizes partly along each dimension
separately, but also on the two dimensions conjunctively, so that it
could if needed
learn to give a special response to a particular *pair* of *x,y*
values.
Another possibility would be to choose some groups of tilings with
coarse generalization and
some with fine.

One can control the tendency to generalize in the different ways
given by each group of tilings by controlling the number of tilings in
each group. For example, if you wanted
discrimination to be stronger along *x* rather than *y*,
then you would ask for more *x* tillings
(more than 32) in the above example, or fewer *y* tilings.

In many applications one wants to make sure hashing, a random process, is done in a consistent way. The hashing done by the routines described here is determined by some tables filled with random numbers the first time the routine is called. To get consistent hashing you need only ensure that the random number generator is in a consistent state for the first call. A convenient way to do this is to make a dummy first call during an initialization portion of your program. For example, an early portion of your C++ code could include:

int dummy_tiles[1];In lisp, you would use an implementation-dependent procedure to set

float dummy_vars[1];

srand(0);

GetTiles(dummy_tiles,1,1,dummy_vars,0);

The LISP function "`get-tiles`" returns a list of tile indices.
It is called as follows:

(get-tilesAll the inputs have the same meanings and uses as described above. Multiple calls are combined by appending (or nconcing) the resultant lists into one list.

num-tilings ; the number of tilings desired

memory-size ; the number of possible tile indices

floats ; a list of real values making up the input vector

ints) ; optional list of integers to get different hashings

The function "`load-tiles`" is also available as a maximally
efficient interface.
It loads a portion of a provided array with the returned tile indices.
It is called as follows:

(load-tiles

tiles ; a provided array for the tile indices to go into

starting-element ; first element of "tiles" to be changed (typically 0)

num-tilings ; the number of tilings desired

memory-size ; the number of possible tile indices

floats ; a list of real values making up the input vector

ints) ; list of optional inputs to get different hashings

The Python function "`getTiles`" returns a tuple of tile
indices. It is called as follows:

getTiles(numTilings, memorySize, floats, ints)All the inputs have the same meanings and uses as described above. Multiple calls are combined by concatenating the resultant tuples into one.

The function "`loadTiles`" is also available as a maximally
efficient interface.
It loads a portion of a provided array with the returned tile indices.
It is called as follows:

loadTiles(tiles, startingElement, numTilings, memorySize, floats, ints)with meanings as in the lisp calls.

With the code described so far, there is a small probability that unrelated inputs will hash into some of the same tiles. In a group of tilings, usually there will be no more than one such "collision", so that it is not a big problem; the learning process will sort it out. There will not be a big effect on performance unless the memory is too small or the hash functions are poorly designed. Nevertheless, the possibility of such a problem is annoying. When one's program doesn't work, there is a tendency, deserved or not, to suspect a failure of the hashing function.

To alleviate this kind of concern, we provide another "safe" version
of the routine that
does the additional work to prevent hashing collisions -- or rather to
detect them and
take the necessary corrective steps. The result is a guaranteed exact
emulation
of the infinite tilings. Every different tile used will have a
different index. If more tiles are used than there is space within the `memory_size`,
then an error
is signaled. Of course there is also a cost in terms of additional
memory and computation.

The additional memory is contained in an additional data structure, called a "collision table". To create and use a collision table in the original 4x4 example:

collision_table ct(512,1);In the declaration of the collision table, the 512 is the size of the table. The size must be a power of two. The 1 in the declaration indicates that this is to be a "safe" table; provide a 0 here if you just want to collect statistics on a regular table. Notice that the collision table is then used in place of the

float float_array[2];

int tiles_array[32];

float_array[0] = x / 0.25;

float_array[1] = y / 0.25;

GetTiles(tiles_array,32,&ct,float_array,2);

You can call `ct.usage()` to determine how many entries in
the table
are currently in use. You can also examine `ct.calls`, `ct.clearhits`,
and `ct.collisions` to determine the number of times this
collision
table has been used (calls), has returned a result without a collision
(clearhits),
and has had collisions (collisions, which may be more than one per
call).

`ct.reset` clears the collision table, making room for a new
set of entries.

In Lisp, you construct a collision-table by calling `make-ct`:

(make-ct

memory-size ; the number of possible tile indices

safety-level) ; one of :safe, :unsafe, or :super-safe

which returns the newly initialized collision-table.
The `memory-size` here must be a power of two, e.g., 1024;
otherwise it is
the same `memory-size` as before. If the `safety-level`
is `:unsafe`, then
the collision table is used only to collect statistics; collisions are
still allowed
and not corrected for. Normally, a `safety-level` of `:safe`
is adequate, but `:super-safe` is available to be absolutely
sure.

Once a collision table has been constructed, it is used in any of
the routines
previously described simply by passing it in in place of the `memory-size`
input to the routine. For example, a lisp version of the 32 4x4 tilings
example, using the safe version, would look like
this:

(setq ctable (make-ct 512 :safe)) ; this is done once

(get-tiles 32 ctable (list (/ x .25) (/ y .25))) ; this is done for each x,y

In Python, you obtain a collisionTable by calling `collisionTable(memorySize,
safetyLevel)`, with meanings as above except `safetyLevel`

is one of 'safe', 'unsafe', or 'superSafe'.
Once a collision table has been constructed, it is used in any of the
routines
previously described simply by passing it in in place of the `memorySize`
input to the routine. For example, a python version of the 32 4x4
tilings example, using the safe version, would
look like this:

ctable = collisionTable(512, 'safe') # this is done once

getTiles(32, ctable, (x/.25, y/.25)) # this is done for each x,y

The tilings we have discussed so far stretch out to infinity with no need to specify a range for them. This is cool, but sometimes not what you want. Sometimes you want the variables to wrap-around over some range. For example, you may have an angle variable that goes from 0 to 2pi and then should wrap around, that is, you would like generalization to occur between angles just greater than 0 and angles that are nearly 2pi. To accomodate this, we provide some alternative, wrap-around versions of the tiling routines.

These versions take an additional input, `wrap-widths`,
which parallels the float structure (array or list), and which
specifies for each float the width of the range over which it wraps
(e.g., 2pi). If you don't want a float to wrap, it's wrap-width should
be zero (in LISP, nil). The wrapping width is in the same units as the
floats, but should be an integer. Specifying the complete range of
wrapping (rather than just the width) is not necessary for the same
reason we did not need to give a range at all in the previous routines.

For example, suppose you wanted to cover the 2pi x 3 rectangular area with 16 tilings, each with a width of generalization one-tenth of the space in each dimension, and with wrap-around in the first dimension but not the second. In C you would do

int tiles_array[16];where

float float_array[2];

int wrap_array[2];

float_array[0] = x / (3 * 0.1);

wrap_array[0] = 0;

float_array[1] = y / (2 * 3.14159 * 0.1);

wrap_array[1] = 10;

GetTilesWrap(tiles_array,16,512,float_array,2,wrap_array);

void GetTilesWrap(

int tiles[], // provided array will contain the returned tile indices

int num_tilings, // number of tile indices to be returned

int memory_size, // total number of possible tiles

float floats[], // array of floating-point vars to be converted to tiles

int num_floats, // number of floating-point variables

int wrap_widths[], // array of widths (length and units as in floats)

int ints[], // array of integer variables to be converted to tiles

int num_ints) // number of integer variables

There is a known bug in the C version - because of an oddity of the C mod function the results are very slightly off as inputs cross zero.

In LISP, the wrapping routines are:

(get-tiles-wrapand

num-tilings ; the number of tilings desired

memory-size ; the number of possible tile indices

floats ; a list of real values making up the input vector

wrap-widths ; list of widths for wrap-around on floats (ints or nil)

ints) ; optional list of integers to get different hashings

(load-tiles-wrap

tiles ; a provided array for the tile indices to go into

starting-element ; first element of "tiles" to be changed (typically 0)

num-tilings ; the number of tilings desired

memory-size ; the number of possible tile indices

floats ; a list of real values making up the input vector

wrap-widths ; list of widths for wrap-around on floats (ints or nil)

ints) ; list of optional inputs to get different hashings

In Python, the wrapping routines are:

getTilesWrap(numTilings, memorySize, floats, wrapWidths, ints)and

loadTilesWrap(tiles, startingElement, numTilings, memorySize, floats, \

wrapWidths, ints)