The term softmax for the action selection rule (2.2) is due to Bridle (1990). This rule appears to have been first proposed by Luce (1959). The rule's parameter is called the temperature in simulated annealing algorithms (Kirkpatrick, Gelatt and Vecchi, 1983).