Good–Turing estimator

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Encyclops at 00:49, 10 December 2005. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The Good–Turing estimator (named after I. J. Good and Alan Turing) is a statistical technique for predicting the probability of occurrence of objects belonging to an unknown number of species, given past observations of such objects and their species.

The case where the number of species is known in advance had already been solved by Bayes and Laplace; the result is called the Laplace-Bayes estimator or the rule of succession. For example, if an urn contains two species of balls (red and black) and, after sampling N of them, we have found that r are black, then the current estimate of the probability that the next ball drawn is black is (r+1)/(N+2). Good and Turing considered the more general problem in which the number of species is not known in advance: they provide a probability estimate for each species observed so far, as well as an estimate of the probability that the next draw will belong to a new, previously unseen species.
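
The urn example can be checked numerically. The following is a minimal Python sketch of the rule of succession, not part of the original article. The function name `rule_of_succession` and the `species` parameter are illustrative assumptions; the parameter generalizes the two-species formula (r+1)/(N+2) to (r+1)/(N+k) for a known number k of species, which is the usual add-one form of the estimator.

```python
from fractions import Fraction


def rule_of_succession(r: int, n: int, species: int = 2) -> Fraction:
    """Estimate the probability that the next draw is of a given species.

    r       -- number of sampled objects that belonged to this species
    n       -- total number of objects sampled so far
    species -- known number of species (assumed parameter; 2 for the urn example)
    """
    if n < 0 or not 0 <= r <= n:
        raise ValueError("need 0 <= r <= n")
    if species < 1:
        raise ValueError("need at least one species")
    # Add one imaginary observation per species, then take the relative frequency.
    return Fraction(r + 1, n + species)


# Urn example: 10 balls sampled, 3 of them black.
p_black = rule_of_succession(3, 10)
p_red = rule_of_succession(7, 10)
print(p_black, p_red, p_black + p_red)
```

Exact fractions are used so that the estimates for the two colours can be seen to sum to one. Note that the sketch requires the number of species to be supplied up front, which is exactly the assumption that Good and Turing drop.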

Some notation [...]