This is an old revision of this page, as edited by 71.94.235.196 (talk) at 22:53, 7 June 2019 (removed duplicate sentence).
Suppose that we take a sample of size n from each of k populations with the same normal distribution N(μ, σ²), that ȳ_min is the smallest of these sample means and ȳ_max is the largest, and that s² is the pooled sample variance from these samples. Then the following random variable has a studentized range distribution with k groups and ν = k(n − 1) degrees of freedom:
$$ q = \frac{\bar{y}_{\max} - \bar{y}_{\min}}{s/\sqrt{n}} $$
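As a small illustration of the definition, the sketch below computes the statistic from k equal-sized samples. The function name `studentized_range_statistic` and the example data are assumptions made for illustration; the pooled variance is taken as the mean of the per-sample variances, which is valid only because every sample has the same size n.

```python
import math
import statistics

def studentized_range_statistic(groups):
    """Return (q, k, nu) for k samples that all have the same size n."""
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("all samples must have the same size n")
    means = [statistics.fmean(g) for g in groups]
    # With equal sample sizes the pooled variance is the average of the
    # per-sample variances (each computed with n - 1 in the denominator).
    pooled_var = statistics.fmean([statistics.variance(g) for g in groups])
    q = (max(means) - min(means)) / math.sqrt(pooled_var / n)
    nu = k * (n - 1)  # degrees of freedom of the pooled variance
    return q, k, nu

# Made-up data: three samples of size three.
q, k, nu = studentized_range_statistic([[1, 2, 3], [2, 3, 4], [4, 5, 6]])
print(q, k, nu)
```

Here the sample means are 2, 3 and 5 and every sample variance is 1, so q = 3 / sqrt(1/3).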
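Probability density function
Differentiating the cumulative distribution function with respect to q gives the probability density function:
$$ f_R(q;k,\nu) = \frac{\sqrt{2\pi}\,k\,(k-1)\,\nu^{\nu/2}}{\Gamma(\nu/2)\,2^{\nu/2-1}} \int_0^\infty s^{\nu}\,\varphi(\sqrt{\nu}\,s) \left[ \int_{-\infty}^{\infty} \varphi(z+qs)\,\varphi(z)\,\left[\Phi(z+qs)-\Phi(z)\right]^{k-2}\,dz \right] ds $$
Note that in the outer part of the integral, the equation
$$ \varphi(\sqrt{\nu}\,s)\,\sqrt{2\pi} = e^{-\nu s^2/2} $$
was used to replace an exponential factor.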
Cumulative distribution function
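The cumulative distribution function is given by [1]
$$ F_R(q;k,\nu) = \frac{\sqrt{2\pi}\,k\,\nu^{\nu/2}}{\Gamma(\nu/2)\,2^{\nu/2-1}} \int_0^\infty s^{\nu-1}\,\varphi(\sqrt{\nu}\,s) \left[ \int_{-\infty}^{\infty} \varphi(z)\,\left[\Phi(z+qs)-\Phi(z)\right]^{k-1}\,dz \right] ds $$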
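The double integral for the cumulative distribution function has no closed form in general, but it can be approximated by nested quadrature. The following is a minimal sketch using a composite Simpson rule; the function names, the integration limits (z in [−8, 8], s in [0, 1 + 8/√ν]) and the grid sizes are assumptions chosen for illustration, not the algorithm of the cited references, and the code is intended only for integer k ≥ 2 and moderate ν ≥ 1.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def studentized_range_cdf(q, k, nu, n_outer=200, n_inner=400):
    """Approximate F_R(q; k, nu) for q >= 0."""
    # log of sqrt(2 pi) * nu^(nu/2) / (Gamma(nu/2) * 2^(nu/2 - 1))
    log_c = (0.5 * math.log(2.0 * math.pi) + 0.5 * nu * math.log(nu)
             - math.lgamma(0.5 * nu) - (0.5 * nu - 1.0) * math.log(2.0))

    def inner(s):
        # integral over z of phi(z) * [Phi(z + q s) - Phi(z)]^(k - 1)
        g = lambda z: phi(z) * (Phi(z + q * s) - Phi(z)) ** (k - 1)
        return simpson(g, -8.0, 8.0, n_inner)

    def outer(s):
        # s^(nu - 1) * phi(sqrt(nu) s) times the inner integral
        return s ** (nu - 1) * phi(math.sqrt(nu) * s) * inner(s)

    s_max = 1.0 + 8.0 / math.sqrt(nu)  # the weight in s is negligible beyond this
    return k * math.exp(log_c) * simpson(outer, 0.0, s_max, n_outer)

print(studentized_range_cdf(2.0, 2, 2))
```

For k = 2 the quantity q/√2 is the absolute value of a Student t variable with ν degrees of freedom, which gives a convenient check: with ν = 2 the exact value is q/√(4 + q²), so the printed number should be close to 2/√8.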
Special cases
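If k is 2 or 3,[2] the studentized range probability density function can be evaluated directly, where φ(z) is the standard normal probability density function and Φ(z) is the standard normal cumulative distribution function:
$$ f_R(q;k=2) = \sqrt{2}\,\varphi\!\left(q/\sqrt{2}\right) $$
$$ f_R(q;k=3) = 6\sqrt{2}\,\varphi\!\left(q/\sqrt{2}\right)\left[\Phi\!\left(q/\sqrt{6}\right) - \tfrac{1}{2}\right] $$
As written, these expressions do not involve ν; they correspond to the limit of infinitely many degrees of freedom.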
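When the number of degrees of freedom approaches infinity, the studentized range cumulative distribution function can be calculated for any k using the standard normal distribution:
$$ F_R(q;k) = k \int_{-\infty}^{\infty} \varphi(z)\,\left[\Phi(z+q)-\Phi(z)\right]^{k-1}\,dz $$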
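The special cases can be cross-checked against each other. The sketch below evaluates the large-ν cumulative distribution function k ∫ φ(z) [Φ(z + q) − Φ(z)]^(k−1) dz by Simpson's rule and compares its numerical derivative with the closed-form densities √2 φ(q/√2) for k = 2 and 6√2 φ(q/√2) [Φ(q/√6) − 1/2] for k = 3. All function names, the integration interval [−8, 8] and the step sizes are assumptions for illustration.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def range_cdf_limit(q, k, n=400):
    """k * integral of phi(z) [Phi(z + q) - Phi(z)]^(k - 1) dz by Simpson's rule."""
    a, b = -8.0, 8.0
    h = (b - a) / n
    f = lambda z: phi(z) * (Phi(z + q) - Phi(z)) ** (k - 1)
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return k * total * h / 3.0

def range_pdf_k2(q):
    """Closed-form density for k = 2."""
    return math.sqrt(2.0) * phi(q / math.sqrt(2.0))

def range_pdf_k3(q):
    """Closed-form density for k = 3."""
    return (6.0 * math.sqrt(2.0) * phi(q / math.sqrt(2.0))
            * (Phi(q / math.sqrt(6.0)) - 0.5))

def numerical_derivative(f, x, h=1e-4):
    """Central difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Each row prints k, q, the slope of the CDF, and the closed-form density.
for k, pdf in ((2, range_pdf_k2), (3, range_pdf_k3)):
    for q in (0.5, 1.0, 2.0, 3.0):
        slope = numerical_derivative(lambda x: range_cdf_limit(x, k), q)
        print(k, q, round(slope, 6), round(pdf(q), 6))
```

The last two columns of each row should agree to within the accuracy of the finite difference.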
Uses
Critical values of the studentized range distribution are used in Tukey's range test.
Derivation of the studentized range distribution function
The studentized range distribution function arises from re-scaling the sample range R by the sample standard deviation s, since the studentized range is customarily tabulated in units of standard deviations, with the variable q = R⁄s. The derivation begins with a perfectly general form of the range distribution function, which does not depend on the form of the distribution of the sample data. However, in order to obtain the distribution in terms of q, one must introduce s, assume normality, and then integrate over s in order to remove it.
General form
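For any probability density function fX, the range probability density fR is:[2]
$$ f_R(r;k) = k\,(k-1) \int_{-\infty}^{\infty} f_X\!\left(t+\tfrac{1}{2}r\right) f_X\!\left(t-\tfrac{1}{2}r\right) \left[ \int_{t-\frac{1}{2}r}^{t+\frac{1}{2}r} f_X(x)\,dx \right]^{k-2} dt $$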
What this means is that we are adding up the probabilities that, given k draws from a distribution, two of them differ by r, and the remaining k − 2 draws all fall between the two extreme values.
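If we change variables from t to u, where u = t − r/2 is the low end of the range, and define FX as the cumulative distribution function of fX, then the equation can be simplified:
$$ f_R(r;k) = k\,(k-1) \int_{-\infty}^{\infty} f_X(u+r)\,f_X(u)\,\left[F_X(u+r)-F_X(u)\right]^{k-2}\,du $$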
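We introduce a similar integral, and notice that differentiating under the integral sign gives
$$ \frac{\partial}{\partial r}\left[ k \int_{-\infty}^{\infty} f_X(u)\,\left[F_X(u+r)-F_X(u)\right]^{k-1}\,du \right] = k\,(k-1) \int_{-\infty}^{\infty} f_X(u+r)\,f_X(u)\,\left[F_X(u+r)-F_X(u)\right]^{k-2}\,du $$
which recovers the integral above, so that relation confirms
$$ F_R(r;k) = k \int_{-\infty}^{\infty} f_X(u)\,\left[F_X(u+r)-F_X(u)\right]^{k-1}\,du $$
because the derivative of any continuous cumulative distribution function is its density, ∂FR/∂r = fR, and both sides vanish at r = 0.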
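Because the general form holds for any parent distribution, it can be checked on a case with a known answer. The sketch below evaluates F_R(r; k) = k ∫ f_X(u) [F_X(u + r) − F_X(u)]^(k−1) du numerically for a unit-rate exponential parent, for which the integral works out by hand to (1 − e^(−r))^(k−1). The choice of the exponential distribution, the truncation of the integral at u = 40 and all names are assumptions made for illustration.

```python
import math

def range_cdf(r, k, pdf, cdf, lo, hi, n=4000):
    """k * integral over [lo, hi] of pdf(u) [cdf(u + r) - cdf(u)]^(k - 1) du."""
    h = (hi - lo) / n
    g = lambda u: pdf(u) * (cdf(u + r) - cdf(u)) ** (k - 1)
    total = g(lo) + g(hi)
    for i in range(1, n):
        # composite Simpson weights 4, 2, 4, ..., 4
        total += (4.0 if i % 2 else 2.0) * g(lo + i * h)
    return k * total * h / 3.0

# Illustrative parent distribution (an assumption): exponential with rate 1.
def exp_pdf(x):
    return math.exp(-x) if x >= 0.0 else 0.0

def exp_cdf(x):
    return 1.0 - math.exp(-x) if x >= 0.0 else 0.0

def exp_range_cdf_exact(r, k):
    """Closed form obtained by carrying out the integral by hand."""
    return (1.0 - math.exp(-r)) ** (k - 1)

for k in (2, 3, 5):
    for r in (0.5, 1.0, 2.0):
        approx = range_cdf(r, k, exp_pdf, exp_cdf, 0.0, 40.0)
        print(k, r, round(approx, 6), round(exp_range_cdf_exact(r, k), 6))
```

Passing the standard normal density and cumulative distribution function instead, with limits such as −8 and 8, gives the range distribution that the rest of the derivation builds on.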
The range distribution is most often used for confidence intervals around sample averages, which are asymptotically normally distributed by the central limit theorem.
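In order to create the studentized range distribution for normal data, we first switch from f and F to φ and Φ for the standard normal distribution, and change the variable r to s q, where q is a fixed factor that re-scales r by the scaling factor s:
$$ f_R(q;k) = s\,k\,(k-1) \int_{-\infty}^{\infty} \varphi(u+sq)\,\varphi(u)\,\left[\Phi(u+sq)-\Phi(u)\right]^{k-2}\,du $$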
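Choose the scaling factor s to be the sample standard deviation, so that q is the width of the range measured in standard deviations. For normal data s follows a (scaled) chi distribution, whose density fS for s > 0 is
$$ f_S(s;\nu) = \frac{\nu^{\nu/2}\,s^{\nu-1}\,e^{-\nu s^2/2}}{2^{\nu/2-1}\,\Gamma(\nu/2)} $$
Multiplying the distributions fR and fS and integrating to remove the dependence on the standard deviation s gives the studentized range distribution function for normal data:
$$ f_R(q;k,\nu) = \frac{\nu^{\nu/2}\,k\,(k-1)}{\Gamma(\nu/2)\,2^{\nu/2-1}} \int_0^\infty s^{\nu}\,e^{-\nu s^2/2} \int_{-\infty}^{\infty} \varphi(u+sq)\,\varphi(u)\,\left[\Phi(u+sq)-\Phi(u)\right]^{k-2}\,du\,ds $$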
where
q is the width of the data range measured in standard deviations,
ν is the number of degrees of freedom for determining the sample standard deviation,[b] and
k is the number of data points in the range.
The equation for the pdf shown in the sections above comes from using
$$ e^{-\nu s^2/2} = \sqrt{2\pi}\,\varphi(\sqrt{\nu}\,s) $$
to replace the exponential expression in the outer integral.
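The construction used in this derivation, a range R of k standard normal values divided by an independent scale s with ν s² chi-squared distributed on ν degrees of freedom, can also be simulated directly. The following is a rough Monte Carlo sketch; the function names, the seed and the number of draws are assumptions for illustration, and independence of R and s is assumed as in the derivation.

```python
import math
import random

def simulate_q(k, nu, rng):
    """Draw one studentized range q = R / s."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(k)]
    r = max(xs) - min(xs)
    # A chi-squared variable with nu degrees of freedom is a gamma variable
    # with shape nu/2 and scale 2; s^2 is that variable divided by nu.
    s = math.sqrt(rng.gammavariate(0.5 * nu, 2.0) / nu)
    return r / s

def empirical_cdf(q, k, nu, draws=100_000, seed=12345):
    """Fraction of simulated studentized ranges that are at most q."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(draws) if simulate_q(k, nu, rng) <= q)
    return hits / draws

# For k = 2, q / sqrt(2) is |t| with nu degrees of freedom; with nu = 2 the
# exact probability that q <= 2 is 2 / sqrt(8).
print(empirical_cdf(2.0, 2, 2), 2.0 / math.sqrt(8.0))
```

The two printed values should agree to roughly two decimal places; the simulation error shrinks like one over the square root of the number of draws.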
Lund, R.E.; Lund, J.R. (1983). "Algorithm AS 190: Probabilities and upper quantiles for the studentized range". Journal of the Royal Statistical Society. 32 (2): 204–210. JSTOR 2347300.
Dunlap, W.P.; Powell, R.S.; Konnerth, T.K. (1977). "A FORTRAN IV function for calculating probabilities associated with the studentized range statistic". Behavior Research Methods & Instrumentation. 9 (4): 373–375. doi:10.3758/BF03202264.