RLAI open web page

	Reinforcement Learning and Computer Go (RLGO)
	Uncertainty

The ambition of this web page is to define and discuss the meaning of uncertainty in the game of Go.

In Go many things are uncertain. But what do we mean by this?

Normally we mean that we are making a prediction, because we cannot evaluate something perfectly. This could be due to:

The opponent's policy is unknown.
The opponent's policy may be stochastic.
Our own policy may be stochastic.
Exact evaluation may not be computationally feasible, even when both policies are deterministic.

For now we assume that the opponent is playing according to some fixed, stochastic policy πT and that we are playing according to some fixed, stochastic policy πU (T for "them", U for "us"). The opponent's policy need not be known.

To represent a prediction, we must pose the question that we wish to answer. To evaluate the prediction, we play out a game following the policies specified by the question and see what answer actually results. A single play-out gives only one sample of the answer, so to improve the estimate we average over many evaluations (in other words, we play out the position many times).
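The play-out procedure above can be sketched in code. This is a minimal illustration rather than an implementation of anything on this page: Go is replaced by a hypothetical toy game (players alternately add 1 or 2 to a running total, and whoever first reaches TARGET wins), and the names Question, make_policy, play_out and estimate are all assumptions introduced for the example. The opponent's policy is only sampled, never inspected, which matches the assumption that it need not be known.

```python
import random
from dataclasses import dataclass
from typing import Callable

# Hypothetical toy game standing in for Go: players alternately add 1 or 2
# to a running total, and whoever first reaches TARGET wins.
TARGET = 10

Policy = Callable[[int, random.Random], int]  # (total, rng) -> amount to add


def make_policy(p_two: float) -> Policy:
    """Fixed stochastic policy: add 2 with probability p_two, otherwise add 1."""
    def policy(total: int, rng: random.Random) -> int:
        return 2 if rng.random() < p_two else 1
    return policy


@dataclass(frozen=True)
class Question:
    """Everything the question must specify before it can be evaluated."""
    start_total: int   # the position being asked about
    us_to_play: bool   # who is to play
    pi_us: Policy      # our fixed policy
    pi_them: Policy    # their fixed policy (sampled, never inspected)
    horizon: int       # timescale: maximum number of moves considered


def play_out(q: Question, rng: random.Random) -> float:
    """One evaluation: 1.0 if we reach TARGET within the horizon, else 0.0."""
    total, us_to_play = q.start_total, q.us_to_play
    for _ in range(q.horizon):
        policy = q.pi_us if us_to_play else q.pi_them
        total += policy(total, rng)
        if total >= TARGET:
            return 1.0 if us_to_play else 0.0
        us_to_play = not us_to_play
    return 0.0  # the event did not occur within the timescale


def estimate(q: Question, n_playouts: int, seed: int = 0) -> float:
    """Average the answers of many independent play-outs."""
    rng = random.Random(seed)
    return sum(play_out(q, rng) for _ in range(n_playouts)) / n_playouts
```

For example, estimate(Question(0, True, make_policy(0.5), make_policy(0.5), horizon=20), 10_000) averages 10,000 play-outs from the empty position with us to play. Changing any field of the Question (the side to move, either policy, or the horizon) changes what is being estimated.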

[FOOTNOTE: In real games, the opponent's policy may also vary according to their model of our own policy. However, we can incorporate this into the above framework by including the opponent's model in the state. The policy then remains fixed with respect to this augmented state.]

Probability and expectation

We often use terms such as probability and expectation when discussing Go. But what do these terms really mean?

When we refer to the probability of an event occurring, we are normally describing the question of whether a binary observation will become 1 at any point in the future. But once we view the probability as a question, it is clear that we must specify more information: who is to play, what policies will be followed by us and them, and what timescale we are interested in. Without these specifications the term probability is ill-defined!

When we refer to the expectation of a value, we are normally describing the expected outcome of that value. Again, we can view this as answering a question; after all, this is the definition of a question! However, we must again specify more information: who is to play, what policies will be followed by us and them, and what timescale we are interested in. Without these specifications the term expectation is ill-defined!
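One way to make these specifications explicit is to write them into the notation. The following is only a sketch, and apart from the two policies (written \pi_U and \pi_T for πU and πT) every symbol is an assumption introduced here for illustration: s is the position, p is the player to move, H is the timescale measured in moves, o_t is the binary observation at move t, and z_H is the value of interest as observed at the timescale H.

```latex
\begin{align*}
P(s, p, \pi_U, \pi_T, H) &= \Pr\left[\, \exists\, t \le H : o_t = 1 \;\middle|\; s_0 = s,\ \text{$p$ to play},\ \pi_U,\ \pi_T \right] \\
E(s, p, \pi_U, \pi_T, H) &= \mathbb{E}\left[\, z_H \;\middle|\; s_0 = s,\ \text{$p$ to play},\ \pi_U,\ \pi_T \right]
\end{align*}
```

Written this way, omitting any one argument leaves the right-hand side undefined, which is the sense in which the unqualified terms are ill-defined.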

In general, the question framework forces us to be more precise with our descriptions of uncertainty than when we use the vague but familiar terms probability and expectation.

This open web page is hosted at the University of Alberta.