RLAI Reinforcement Learning and Computer Go (RLGO)
Uncertainty

The ambition of this web page is to define and discuss the meaning of uncertainty in the game of Go.



In Go many things are uncertain. But what do we mean by this?

Normally we mean that we are making a prediction, because we cannot evaluate something perfectly. This could be due to:


For now we assume that the opponent plays according to some fixed stochastic policy πT and that we play according to some fixed stochastic policy πU (the "them" and "us" policies). The opponent's policy need not be known.

To represent a prediction we must first pose the question that we wish to answer. To evaluate the prediction, we play out a game following the policies specified by the question and see what answer actually results. To improve the estimate, we average over many evaluations (in other words, we play out the position many times).

[FOOTNOTE: in real games, the opponent's policy may also vary according to their model of our own policy. However, we could incorporate this into the above framework by including the opponent's model in the state. The policy then remains fixed with respect to this new state.]
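
The evaluation procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the game is a toy stand-in for Go (the state is a score margin plus a count of remaining moves), and the names `step`, `outcome`, `policy_us`, `policy_them`, `play_out` and `estimate` are all hypothetical. What matters is the shape of the computation: both policies are held fixed, each play-out returns one answer to the question, and the estimate is the average of those answers.

```python
import random

# Toy stand-in for Go (an assumption for illustration only): the state is
# (margin, moves_left). Moves by "us" add to the margin, moves by "them"
# subtract from it.
def step(state, action, player):
    margin, moves_left = state
    delta = action if player == "us" else -action
    return (margin + delta, moves_left - 1)

def is_terminal(state):
    return state[1] == 0

def outcome(state):
    # The answer to the question "do we win?": 1 if the final margin is positive.
    return 1.0 if state[0] > 0 else 0.0

# Fixed stochastic policies: each maps a state to a randomly chosen action.
def policy_us(state, rng):      # plays the role of pi_U
    return rng.choice([0, 1, 2])

def policy_them(state, rng):    # plays the role of pi_T
    return rng.choice([0, 1])

def play_out(state, to_play, rng):
    """One evaluation: play the game out under both policies, return the answer."""
    while not is_terminal(state):
        policy = policy_us if to_play == "us" else policy_them
        state = step(state, policy(state, rng), to_play)
        to_play = "them" if to_play == "us" else "us"
    return outcome(state)

def estimate(state, to_play, n_playouts, seed=0):
    """Average the answers from many independent play-outs of the same position."""
    rng = random.Random(seed)
    total = sum(play_out(state, to_play, rng) for _ in range(n_playouts))
    return total / n_playouts

if __name__ == "__main__":
    # Estimated chance that "us" wins from an even position with 10 moves left.
    print(estimate((0, 10), "us", 10_000))
```

Note that `estimate` never inspects `policy_them` directly; it only samples games against it. This matches the remark that the opponent's policy need not be known, as long as we can play against it.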



Probability and expectation


We often use terms such as probability and expectation when discussing Go. But what do these terms really mean?

When we refer to the probability of an event occurring, we are normally describing the question of whether a binary observation will become 1 at any point in the future. But once we view the probability as a question, it is clear that we must specify more information: who is to play, what policies will be followed by us and them, and what timescale we are interested in. Without these specifications the term probability is ill-defined!

When we refer to the expectation of a value, we are normally describing the average outcome of that value over many play-outs. Again, we can view this as answering a question - after all, this is the definition of a question! However, we must again specify more information: who is to play, what policies will be followed by us and them, and what timescale we are interested in. Without these specifications the term expectation is ill-defined!
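
To make the required specifications concrete, the following sketch bundles them into a single object. It is a hypothetical illustration, not a definition taken from this page: the `Question` class, its field names, and the toy dynamics (an integer margin that "us" increases and "them" decreases) are all assumptions. A "probability" is then the answer to a question whose observation is binary and whose summary asks whether it ever became 1 within the horizon. An "expectation" is the answer to a question whose summary is the value observed at the end of the horizon.

```python
import random
from dataclasses import dataclass, replace
from typing import Callable, List

Policy = Callable[[int, random.Random], int]

@dataclass(frozen=True)
class Question:
    """Everything that must be fixed before a prediction is well defined."""
    to_play: str                               # who moves first: "us" or "them"
    policy_us: Policy                          # plays the role of pi_U
    policy_them: Policy                        # plays the role of pi_T
    horizon: int                               # timescale, in moves (at least 1)
    observe: Callable[[int], float]            # what we measure after each move
    summarise: Callable[[List[float]], float]  # how measurements form one answer

def answer_once(q, state, rng):
    """Play one game for q.horizon moves and return the answer that results."""
    player, seen = q.to_play, []
    for _ in range(q.horizon):
        if player == "us":
            state += q.policy_us(state, rng)
        else:
            state -= q.policy_them(state, rng)
        seen.append(q.observe(state))
        player = "them" if player == "us" else "us"
    return q.summarise(seen)

def estimate(q, state, n_playouts=10_000, seed=0):
    """Average the answers over many play-outs from the same state."""
    rng = random.Random(seed)
    return sum(answer_once(q, state, rng) for _ in range(n_playouts)) / n_playouts

def coin(state, rng):
    # Toy stochastic policy: gain 0 or 1 point with equal probability.
    return rng.choice([0, 1])

# "Probability": does the binary observation (margin >= 3) become 1
# at any point within 12 moves?
prob_q = Question("us", coin, coin, 12, lambda m: float(m >= 3), max)

# "Expectation": the value of the margin after 12 moves.
exp_q = Question("us", coin, coin, 12, float, lambda seen: seen[-1])

if __name__ == "__main__":
    print(estimate(prob_q, 0))
    print(estimate(exp_q, 0))
    # Changing only the timescale or the player to move changes the answer.
    print(estimate(replace(prob_q, horizon=2), 0))
    print(estimate(replace(exp_q, horizon=1, to_play="them"), 0))
```

Leaving out any field of `Question` makes `estimate` impossible to compute, which is the sense in which an unqualified "probability" or "expectation" is ill-defined.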

In general, the question framework forces us to be more precise with our descriptions of uncertainty than when we use the vague but familiar terms probability and expectation.


This open web page is hosted at the University of Alberta.