Good–Turing estimator

The Good–Turing estimator (named after I. J. Good and Alan Turing) is a statistical technique for predicting the probability of occurrence of objects belonging to an unknown number of species, given past observations of such objects and their species.

The case where the number of species is known in advance had already been solved by Bayes and Laplace; the result is called the Laplace-Bayes estimator or the rule of succession. For example, if an urn contains two species of balls (red and black) and, after sampling N of them, we have found that r are black, then the current estimate for the probability that the next ball drawn is black is (r+1)/(N+2). Good and Turing considered a more general problem in which the number of species is not known in advance; they provided a probability estimate for each of the species observed so far, as well as an estimate of the probability that a new, previously unseen species will be drawn.
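The two estimates described above can be illustrated with a minimal Python sketch. The function names rule_of_succession and unseen_probability are hypothetical, chosen only for this example. The first function generalizes the urn formula to k known species, (r+1)/(N+k), which reduces to (r+1)/(N+2) for k = 2. The second uses the basic Good–Turing estimate N1/N for the probability of a new species, where N1 is the number of species seen exactly once; that formula is the standard one and is an assumption here, since it is not stated in the text above.

```python
from collections import Counter
from fractions import Fraction


def rule_of_succession(r, n, k=2):
    """Laplace-Bayes estimate (r + 1) / (n + k) for a species seen r times
    in n draws, when the number of species k is known in advance."""
    return Fraction(r + 1, n + k)


def unseen_probability(sample):
    """Basic Good-Turing estimate N1 / N of the probability that the next
    draw belongs to a species not yet observed (assumed standard formula)."""
    counts = Counter(sample)              # species -> number of times seen
    n = sum(counts.values())              # total number of observations N
    if n == 0:
        raise ValueError("sample must contain at least one observation")
    n1 = sum(1 for c in counts.values() if c == 1)  # species seen exactly once
    return Fraction(n1, n)


if __name__ == "__main__":
    # Urn with two species: 3 black balls among 10 draws.
    print(rule_of_succession(3, 10))
    # Unknown number of species: two of the three species were seen once.
    print(unseen_probability(["red", "red", "black", "green"]))
```

Exact fractions are used instead of floats so that the results can be compared directly with the formulas. The sketch does not cover the per-species Good–Turing estimates, which need further smoothing in practice.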

Some notation [...]
