L-estimator

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Nbarth (talk | contribs) at 06:21, 14 April 2013 (Examples: location, spread). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In statistics, an L-estimator is an estimator which equals a linear combination of order statistics of the measurements. This can be as little as a single point, as in the median (of an odd number of values), or as many as all points, as in the mean.
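The definition can be sketched directly in code: sort the data, then take a weighted sum of the sorted values. The function name `l_estimate` and the sample data are illustrative assumptions, not part of any library.

```python
def l_estimate(values, weights):
    """Return sum_i weights[i] * x_(i), where x_(i) is the i-th smallest value.

    Hypothetical helper for illustration: one weight per order statistic.
    """
    ordered = sorted(values)
    if len(weights) != len(ordered):
        raise ValueError("need exactly one weight per order statistic")
    return sum(w * x for w, x in zip(weights, ordered))


data = [7.0, 1.0, 4.0, 10.0, 3.0]
n = len(data)

# The mean uses all points, each with weight 1/n.
mean_weights = [1.0 / n] * n
sample_mean = l_estimate(data, mean_weights)

# The median of an odd number of values uses a single point (the middle one).
median_weights = [0.0, 0.0, 1.0, 0.0, 0.0]
sample_median = l_estimate(data, median_weights)
```

Only the weight vector changes between the two estimators; the sorting step is shared.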

The main benefits of L-estimators are that they are often extremely simple and frequently robust: assuming sorted data, they are very easy to calculate and interpret, and are often resistant to outliers. They are thus useful in robust statistics, as descriptive statistics, in statistics education, and when computation is difficult. However, they are statistically inefficient compared with M-estimators, which are preferred in modern robust statistics even though they are much more difficult to compute. In many circumstances L-estimators are nevertheless reasonably efficient, and thus adequate for initial estimation.

Examples

A basic example is the median. Given $n$ values $x_1, \ldots, x_n$, if $n = 2k + 1$ is odd, the median equals $x_{(k+1)}$, the $(k+1)$-th order statistic; if $n = 2k$ is even, it is the average of two order statistics: $(x_{(k)} + x_{(k+1)})/2$. These are both linear combinations of order statistics, and the median is therefore a simple example of an L-estimator.
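As a minimal sketch of the two cases, the weights that the median places on the order statistics can be built explicitly. Writing the sample size as 2k + 1 (odd) or 2k (even), the weight goes on the (k+1)-th order statistic, or is split evenly between the k-th and (k+1)-th. The names `median_weights` and `median_as_l_estimate` are hypothetical.

```python
def median_weights(n):
    """Weights on the sorted values x_(1) <= ... <= x_(n) that give the median."""
    if n < 1:
        raise ValueError("need at least one value")
    weights = [0.0] * n
    k, is_odd = divmod(n, 2)
    if is_odd:
        # n = 2k + 1: all weight on x_(k+1), which is 0-based index k.
        weights[k] = 1.0
    else:
        # n = 2k: half the weight on x_(k) and half on x_(k+1).
        weights[k - 1] = 0.5
        weights[k] = 0.5
    return weights


def median_as_l_estimate(values):
    """Median computed as a linear combination of order statistics."""
    ordered = sorted(values)
    weights = median_weights(len(ordered))
    return sum(w * x for w, x in zip(weights, ordered))
```

For example, `median_weights(5)` puts weight 1 on the third order statistic, while `median_weights(4)` puts weight 0.5 on each of the second and third.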

A more detailed list of examples includes:

  • with a single point: the maximum and the minimum;
  • with one or two points: the median;
  • with two points: the mid-range, the range, the trimmed mid-range (including the midhinge), and the trimmed range (including the interquartile range);
  • with three points: the trimean;
  • with a fixed fraction of the points: the trimmed mean and the Winsorized mean;
  • with all points: the mean.
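A few of the estimators listed above can be sketched as follows. The function names are illustrative, and the parameter `k` (the number of points trimmed or Winsorized at each end) is an assumed parameterization; a fraction-based version would convert the fraction to a count first.

```python
def mid_range(values):
    """Average of the minimum and maximum (two points)."""
    return (min(values) + max(values)) / 2


def trimmed_mean(values, k):
    """Mean after discarding the k smallest and k largest values."""
    ordered = sorted(values)
    if k < 0 or 2 * k >= len(ordered):
        raise ValueError("k must leave at least one value")
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def winsorized_mean(values, k):
    """Mean after replacing the k smallest and k largest values
    with the nearest remaining order statistic."""
    ordered = sorted(values)
    n = len(ordered)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must leave at least one value")
    low, high = ordered[k], ordered[n - 1 - k]
    clipped = [min(max(x, low), high) for x in ordered]
    return sum(clipped) / n


data = [2, 4, 5, 6, 100]
```

On this sample, the trimmed and Winsorized means with `k = 1` both ignore the influence of the value 100, whereas the mid-range is pulled strongly toward it.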

Note that some of these (such as median, or mid-range) are measures of central tendency, and are used as estimators for a location parameter, such as the mean of a normal distribution, while others (such as range or trimmed range) are measures of statistical dispersion, and are used as estimators of spread or scale, such as the standard deviation of a normal distribution.
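To illustrate the spread case, the sketch below computes the range and the interquartile range, then rescales the interquartile range so that it estimates the standard deviation when the data are normal. The quartile convention (Python's `statistics.quantiles` with `method="inclusive"`, which interpolates between adjacent order statistics) is an assumption; other conventions give slightly different values. The function names are illustrative.

```python
from statistics import NormalDist, quantiles


def sample_range(values):
    """Maximum minus minimum: a spread estimate from two order statistics."""
    return max(values) - min(values)


def interquartile_range(values):
    """Third quartile minus first quartile (assumed 'inclusive' convention)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Interquartile range of a standard normal distribution (about 1.349).
NORMAL_IQR = NormalDist().inv_cdf(0.75) - NormalDist().inv_cdf(0.25)


def normal_scale_from_iqr(values):
    """IQR rescaled to estimate the standard deviation under normality."""
    return interquartile_range(values) / NORMAL_IQR


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Dividing by `NORMAL_IQR` is what turns a generic measure of dispersion into an estimator of a specific scale parameter.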

Robustness

L-estimators are often statistically resistant, having a high breakdown point. The breakdown point is defined as the fraction of the measurements which can be arbitrarily changed without causing the resulting estimate to tend to infinity (i.e., to "break down"). The breakdown point of an L-estimator is determined by the order statistic it uses that lies closest to the minimum or maximum: for instance, the median has a breakdown point of 50% (the highest possible), and a p% trimmed or Winsorized mean (with p% trimmed from each end) has a breakdown point of p%.
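A rough finite-sample sketch of this rule: given the weights on the sorted values, count how many order statistics at each end receive zero weight, and take the smaller count as a fraction of the sample size. The function `breakdown_fraction` is hypothetical, and in small samples it only approximates the limiting values quoted above (for example, 40% rather than 50% for the median of five points).

```python
def breakdown_fraction(weights):
    """Approximate breakdown point from the weights on the order statistics.

    Assumes weights[i] multiplies the (i+1)-th smallest value.
    """
    n = len(weights)
    if n == 0:
        raise ValueError("need at least one weight")

    def leading_zeros(ws):
        # Number of consecutive zero weights at the start of the sequence.
        count = 0
        for w in ws:
            if w != 0:
                break
            count += 1
        return count

    unused_low = leading_zeros(weights)
    unused_high = leading_zeros(reversed(weights))
    return min(unused_low, unused_high) / n
```

For instance, `breakdown_fraction([0, 0, 1, 0, 0])` (the median of five points) gives 0.4, while equal weights on all points (the mean) give 0.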

Not all L-estimators are robust: one that includes the minimum or maximum has a breakdown point of 0. These non-robust L-estimators include the minimum, maximum, mean, and mid-range. Their trimmed equivalents are robust, however.
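The contrast can be seen in a small experiment: replace some of the largest observations with an arbitrarily large value and compare how the mean, a trimmed mean, and the median respond. The helpers `corrupt` and `trimmed_mean`, the sample, and the contaminating value `1e9` are all illustrative choices.

```python
from statistics import mean, median


def corrupt(values, count, bad_value):
    """Replace the `count` largest values with `bad_value`."""
    ordered = sorted(values)
    return ordered[:len(ordered) - count] + [bad_value] * count


def trimmed_mean(values, k):
    """Mean after discarding the k smallest and k largest values."""
    ordered = sorted(values)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


clean = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

one_bad = corrupt(clean, 1, 1e9)
two_bad = corrupt(clean, 2, 1e9)
four_bad = corrupt(clean, 4, 1e9)

mean_one_bad = mean(one_bad)                # dominated by the single outlier
trimmed_one_bad = trimmed_mean(one_bad, 1)  # outlier is trimmed away
trimmed_two_bad = trimmed_mean(two_bad, 1)  # one outlier survives trimming
median_four_bad = median(four_bad)          # still the clean-sample median
```

A single corrupted point is enough to move the mean without bound, the trimmed mean with one point removed per end tolerates exactly one such point, and the median is unchanged even with four of the ten values corrupted.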

References

  • Huber, Peter J. (2004). Robust statistics. New York: Wiley-Interscience. ISBN 0-471-65072-2.
  • Shao, Jun (2003). Mathematical statistics. Berlin: Springer-Verlag. Sec. 5.2.2. ISBN 0-387-95382-5.
  • Fraiman, Ricardo; Meloche, Jean; Garcia-Escudero, Luis; Gordaliza, Alfonso; He, Xuming; Maronna, Ricardo (1999). "Multivariate L-estimation". TEST (2). Springer Berlin / Heidelberg: 255–317. doi:10.1007/BF02595872.