Multinomial test

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Ted7815 (talk | contribs) at 08:21, 25 April 2008.

In statistics, the multinomial test is the likelihood-ratio test of the null hypothesis that the parameters of a multinomial distribution equal specified values. It is used for categorical data; see Read and Cressie[1].

We begin with a sample of $N$ items, each of which has been observed to fall into one of $k$ categories. We can define $x_i$ as the observed number of items in category $i$ (for $i = 1, \dots, k$). Hence $\sum_{i=1}^{k} x_i = N$.

Next, we define a vector of parameters $\pi = (\pi_1, \pi_2, \dots, \pi_k)$, where $\sum_{i=1}^{k} \pi_i = 1$. These are the parameter values under the null hypothesis.

The exact probability of the observed configuration under the null hypothesis is given by

$$\Pr_0 = N! \prod_{i=1}^{k} \frac{\pi_i^{x_i}}{x_i!}.$$

Under the alternative hypothesis, each value $\pi_i$ is replaced by its maximum likelihood estimate $p_i = x_i / N$, and the exact probability of the observed configuration under the alternative hypothesis is given by

$$\Pr_1 = N! \prod_{i=1}^{k} \frac{p_i^{x_i}}{x_i!}.$$

The natural logarithm of the ratio between these two probabilities, multiplied by $-2$, is then the likelihood ratio test statistic

$$-2\ln LR = -2\sum_{i=1}^{k} x_i \ln\left(\frac{\pi_i}{p_i}\right).$$
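
The two exact probabilities and the resulting statistic can be sketched in Python; the observed counts and null probabilities below are made-up illustrative values, not from any real data set:

```python
import math

# Hypothetical observed counts in k = 3 categories, with a uniform null hypothesis
x = [10, 12, 8]
pi = [1 / 3, 1 / 3, 1 / 3]
N = sum(x)

# Exact probability of the observed configuration under the null: N! * prod(pi_i^x_i / x_i!)
p_null = math.factorial(N) * math.prod(p**n / math.factorial(n) for p, n in zip(pi, x))

# Maximum likelihood estimates p_i = x_i / N give the probability under the alternative
p_hat = [n / N for n in x]
p_alt = math.factorial(N) * math.prod(p**n / math.factorial(n) for p, n in zip(p_hat, x))

# Likelihood ratio statistic: -2 ln(p_null / p_alt), which reduces to
# -2 * sum x_i * ln(pi_i / p_i) because the factorial terms cancel
lr = -2 * sum(n * math.log(p0 / p1) for n, p0, p1 in zip(x, pi, p_hat))
print(lr)
```

Note that the factorials cancel in the ratio, so the statistic never requires computing the (potentially enormous) exact probabilities themselves.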

If the null hypothesis is true, then as $N$ increases, the distribution of $-2\ln LR$ converges to that of chi-square with $k-1$ degrees of freedom. However, it has long been known (e.g. Lawley 1956) that for finite sample sizes, the moments of $-2\ln LR$ are greater than those of chi-square, thus inflating the probability of type I errors (false positives). The difference between the moments of chi-square and those of the test statistic is a function of $N^{-1}$. Williams (1976) showed that the first moment can be matched as far as $N^{-1}$ if the test statistic is divided by a factor given by

$$q_1 = 1 + \frac{\sum_{i=1}^{k} \pi_i^{-1} - 1}{6N(k-1)}.$$

In the special case where the null hypothesis is that all $k$ values $\pi_i$ are equal to $1/k$ (i.e. it stipulates a uniform distribution), this simplifies to

$$q_1 = 1 + \frac{k+1}{6N}.$$
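
As a small sketch (with illustrative values of $N$ and $k$), the Williams correction factor can be computed directly from the null probabilities:

```python
def williams_q1(pi, N):
    """Williams (1976) factor: q1 = 1 + (sum(1/pi_i) - 1) / (6*N*(k-1))."""
    k = len(pi)
    return 1 + (sum(1 / p for p in pi) - 1) / (6 * N * (k - 1))

# Under a uniform null (all pi_i = 1/k) this reduces to 1 + (k+1)/(6N)
N, k = 30, 3
q1_uniform = williams_q1([1 / k] * k, N)
print(q1_uniform)  # same value as 1 + (k + 1) / (6 * N)

# The corrected statistic is the likelihood ratio statistic divided by q1
```

Since $q_1 > 1$, dividing by it shrinks the statistic slightly, offsetting the inflated type I error rate at small $N$.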

Subsequently, Smith et al. (1981) derived a dividing factor which matches the first moment as far as $N^{-2}$. For the case of equal values of $\pi_i$, this factor is

$$q_2 = 1 + \frac{k+1}{6N} + \frac{k^2}{6N^2}.$$
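
A corresponding sketch for the equal-probability Smith et al. factor, again with illustrative values of $N$ and $k$:

```python
def smith_q2(k, N):
    """Smith et al. (1981) factor for equal pi_i: q2 = 1 + (k+1)/(6N) + k^2/(6N^2)."""
    return 1 + (k + 1) / (6 * N) + k**2 / (6 * N**2)

N, k = 30, 3
q2 = smith_q2(k, N)
print(q2)  # slightly larger than the first-order factor 1 + (k + 1) / (6 * N)
```

The extra $k^2/(6N^2)$ term vanishes as $N$ grows, so both factors approach 1 for large samples.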


The null hypothesis can also be tested by using the chi-square approximation statistic

$$\chi^2 = \sum_{i=1}^{k} \frac{(x_i - E_i)^2}{E_i},$$

where $E_i = N\pi_i$ is the expected number of cases in category $i$ under the null hypothesis. This statistic also converges to a chi-square distribution with $k-1$ degrees of freedom when the null hypothesis is true, but does so from below, as it were, rather than from above as $-2\ln LR$ does, and so may be preferable to the uncorrected version of $-2\ln LR$ for small samples.
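
The chi-square approximation statistic is straightforward to compute; the counts and null probabilities below are the same kind of made-up illustrative values used above:

```python
# Hypothetical observed counts in k = 3 categories, with a uniform null hypothesis
x = [10, 12, 8]
pi = [1 / 3, 1 / 3, 1 / 3]
N = sum(x)

# Expected counts under the null: E_i = N * pi_i
expected = [N * p for p in pi]

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected
chi2 = sum((o - e) ** 2 / e for o, e in zip(x, expected))
print(chi2)
```

The value would then be compared against the chi-square distribution with $k-1 = 2$ degrees of freedom.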


References

  1. ^ Read, T. R. C. and Cressie, N. A. C. (1988). Goodness-of-fit statistics for discrete multivariate data. New York: Springer-Verlag. ISBN 0-387-96682-X.
  • Lawley, D. N. (1956). "A General Method of Approximating to the Distribution of Likelihood Ratio Criteria". Biometrika. 43: 295–303.
  • Smith, P. J., Rae, D. S., Manderscheid, R. W. and Silbergeld, S. (1981). "Approximating the Moments and Distribution of the Likelihood Ratio Statistic for Multinomial Goodness of Fit". Journal of the American Statistical Association. 76: 737–740.
  • Williams, D. A. (1976). "Improved Likelihood Ratio Tests for Complete Contingency Tables". Biometrika. 63: 33–37.