Probability-proportional-to-size sampling

In survey methodology, probability-proportional-to-size (pps) sampling is a sampling process in which each element of the population (of size $N$) has some (independent) chance $p_{i}$ of being selected into the sample on a single draw. This $p_{i}$ is proportional to some known quantity $x_{i}$, so that $p_{i}={\frac {x_{i}}{\sum _{i=1}^{N}x_{i}}}$.^[1]^: 97^[2]

One case in which this arises, as developed by Hansen and Hurwitz in 1943,^[3] is the selection of clusters of units. When there are several clusters, each containing a different number of units that is known in advance, each cluster can be selected with a probability proportional to the number of units inside it.^[4]^: 250 For example, with 3 clusters of 10, 20 and 30 units, the chance of selecting the first cluster is 10/60 = 1/6, the second 20/60 = 1/3, and the third 30/60 = 1/2.
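The arithmetic in this example can be reproduced with a short sketch. The function name `pps_probabilities` is an illustrative choice, not part of any library, and the sketch assumes the size measures are positive integers so that exact fractions can be used.

```python
from fractions import Fraction


def pps_probabilities(sizes):
    """Return the single-draw selection probabilities p_i = x_i / sum(x).

    Assumes integer size measures (an assumption made here so that the
    result can be shown as exact fractions).
    """
    total = sum(sizes)
    if total <= 0:
        raise ValueError("size measures must have a positive sum")
    return [Fraction(x, total) for x in sizes]


cluster_sizes = [10, 20, 30]  # the three clusters from the example
probs = pps_probabilities(cluster_sizes)
print(probs)  # [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)]
```

The probabilities sum to 1 by construction, since each size measure is divided by the total.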

Pps sampling results in a fixed sample size n (as opposed to Poisson sampling, which is similar but results in a random sample size with an expected value of n). When selecting items with replacement, the procedure is simply to draw one item at a time (like taking n draws from a multinomial distribution over N elements, each with its own selection probability $p_{i}$). When sampling without replacement, the scheme can become more complex.^[1]^: 93
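The contrast between the two designs can be sketched as follows, using only the Python standard library. The function names and the example size measures are hypothetical. The Poisson variant assumes inclusion probabilities of the form $n x_{i}/\sum x_{i}$ (capped at 1), which is a common choice but is an assumption of this sketch rather than something stated above.

```python
import random


def pps_with_replacement(sizes, n, rng):
    """Draw n unit indices with replacement; each draw uses p_i = x_i / sum(x)."""
    return rng.choices(range(len(sizes)), weights=sizes, k=n)


def poisson_sample(sizes, n, rng):
    """Include each unit independently with probability min(1, n * x_i / sum(x)).

    Assumed form of the inclusion probabilities; the realised sample size
    is random, with expectation n when no probability is capped at 1.
    """
    total = sum(sizes)
    return [i for i, x in enumerate(sizes)
            if rng.random() < min(1.0, n * x / total)]


rng = random.Random(0)                   # seeded for reproducibility
sizes = [10, 20, 30, 40, 50, 50]         # hypothetical size measures
wr = pps_with_replacement(sizes, 3, rng)
po = poisson_sample(sizes, 3, rng)
print(len(wr))  # always 3; the same unit may appear more than once
print(len(po))  # varies from run to run when the seed changes
```

The with-replacement sample always has exactly n entries, while the Poisson sample contains each unit at most once and its length changes between runs.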

Distribution and properties

If observations from some distribution F (with density f) are sampled with probability proportional to their value, then the values in the sample follow a length-biased distribution, with the following density function:^[5]^: 2^[6]

$g(x)=xf(x)/E[x]$

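The mean of this distribution is $E_{g}[x]=\int x\,g(x)\,dx=E[x^{2}]/E[x]$, where the expectations on the right-hand side are taken under F.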
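A small simulation can serve as a sanity check of the mean formula. The sketch below assumes, purely for illustration, that F is a unit-rate exponential distribution, for which $E[x]=1$ and $E[x^{2}]=2$, so the size-biased mean should be close to 2. The population size and seed are arbitrary choices.

```python
import random

rng = random.Random(42)  # seeded for reproducibility

# A large population drawn from F (assumed here: exponential with rate 1).
population = [rng.expovariate(1.0) for _ in range(20000)]

# Resample with probability proportional to each value (size-biased draws).
sample = rng.choices(population, weights=population, k=20000)

mean_x = sum(population) / len(population)
mean_x2 = sum(x * x for x in population) / len(population)

predicted = mean_x2 / mean_x          # E[x^2] / E[x] estimated from the population
observed = sum(sample) / len(sample)  # mean of the size-biased sample

print(predicted, observed)  # the two values should be close to each other
```

The agreement is approximate, not exact, because both quantities are estimated from finite samples.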

References

[1] Carl-Erik Sarndal; Bengt Swensson; Jan Wretman (1992). Model Assisted Survey Sampling. ISBN 978-0-387-97528-3.
[2] Skinner, Chris J. "Probability proportional to size (PPS) sampling." Wiley StatsRef: Statistics Reference Online (2014): 1-5.
[3] Hansen, Morris H., and William N. Hurwitz. "On the theory of sampling from finite populations." The Annals of Mathematical Statistics 14.4 (1943): 333-362.
[4] Cochran, W. G. (1977). Sampling Techniques (3rd ed.). Nashville, TN: John Wiley & Sons. ISBN 978-0-471-16240-7.
[5] Mustafa, Abdelfattah; Khan, M.I. (2022). "The length-biased power hazard rate distribution: Some properties and applications." Statistics in Transition new series (SiTns), ISSN 2450-0291, Sciendo, Warsaw, Vol. 23, Iss. 2, pp. 1-16. https://doi.org/10.2478/stattrans-2022-0013
[6] Lee, Kyeongjun. "Estimation of length biased exponential distribution based on progressive hybrid censoring." Communications for Statistical Applications and Methods 31.6 (2024): 661-675.

