User:Hyponull/Normalization (statistics)
History
Standard score (Z-score)
The concept of normalization emerged alongside the study of the normal distribution by Abraham de Moivre, Pierre-Simon Laplace, and Carl Friedrich Gauss during the 18th and 19th centuries. Because “standard” refers to the particular normal distribution with expectation zero and standard deviation one, that is, the standard normal distribution, normalization, in this case called “standardization”, came to refer to the rescaling of any distribution or data set to have mean zero and standard deviation one.
While the study of the normal distribution shaped the process of standardization, its result, known as the Z-score, was not formalized and popularized until Ronald Fisher and Karl Pearson elaborated the concept as part of the broader framework of statistical inference and hypothesis testing in the early 20th century. The Z-score is the difference between a sample value and the population mean, divided by the population standard deviation, and it measures the number of standard deviations a value lies from its population mean.
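In modern notation, the standardization described above is commonly written as

z = \frac{x - \mu}{\sigma},

where x is an observed value, \mu is the population mean, and \sigma is the population standard deviation.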
Student’s t-statistic
William Sealy Gosset initiated the adjustment of the normal distribution and the standard score for small sample sizes. Educated in chemistry and mathematics at Winchester and Oxford, Gosset was employed by the Guinness Brewery, then the largest brewer in Ireland, and was tasked with precise quality control. Through small-sample experiments, Gosset discovered that the distribution of means from small samples deviated slightly from the distribution of means from large samples – the normal distribution – and appeared “taller and narrower” in comparison. He presented this finding in a Guinness internal report titled The application of the “Law of Error” to the work of the brewery and sent it to Karl Pearson for further discussion, which led to a formal publication, The probable error of a mean, in 1908. Because of Guinness Brewery’s confidentiality restrictions, Gosset published the paper under the pseudonym “Student”. Gosset’s work was later refined by Ronald Fisher into the form used today and was popularized, alongside the names “Student’s t-distribution” – referring to the adjusted distribution Gosset proposed – and “Student’s t-statistic” – referring to the test statistic measuring the departure of the estimated value of a parameter from its hypothesized value, divided by its standard error – through Fisher’s publication Applications of “Student’s” distribution.
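In the notation used today, the general form of the t-statistic described above can be written as

t = \frac{\hat{\theta} - \theta_0}{\operatorname{SE}(\hat{\theta})},

where \hat{\theta} is the estimated value of the parameter, \theta_0 is its hypothesized value, and \operatorname{SE}(\hat{\theta}) is the standard error of the estimate. In the one-sample case this reduces to t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, with sample mean \bar{x}, hypothesized population mean \mu_0, sample standard deviation s, and sample size n.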
Feature scaling
The rise of computers and multivariate statistics in the mid-20th century required normalization to process data with different units, giving rise to feature scaling – a method of rescaling data to a fixed range – through techniques such as min-max scaling and robust scaling. This modern normalization process, aimed especially at large-scale data, became more formalized in fields including machine learning, pattern recognition, and neural networks in the late 20th century.
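As an illustration of the two techniques mentioned above, a minimal NumPy sketch of min-max scaling and robust scaling might look like the following (function names and defaults are illustrative rather than taken from any particular library):

```python
import numpy as np

def min_max_scale(x, new_min=0.0, new_max=1.0):
    """Linearly rescale values to the fixed range [new_min, new_max]."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return new_min + (x - x.min()) * (new_max - new_min) / span

def robust_scale(x):
    """Center on the median and scale by the interquartile range,
    making the rescaling less sensitive to outliers."""
    x = np.asarray(x, dtype=float)
    q1, median, q3 = np.percentile(x, [25, 50, 75])
    return (x - median) / (q3 - q1)
```

For example, min_max_scale([2.0, 5.0, 8.0]) returns [0.0, 0.5, 1.0].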
Batch normalization
Batch normalization was proposed by Sergey Ioffe and Christian Szegedy in 2015 to improve the efficiency of training neural networks.
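The transformation itself standardizes each feature over a mini-batch and then applies a learnable rescaling. A minimal NumPy sketch of the training-time computation (variable names are illustrative, and the running statistics used at inference time are omitted):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over a mini-batch, then scale and shift.

    x     : activations of shape (batch_size, num_features)
    gamma : learnable per-feature scale, shape (num_features,)
    beta  : learnable per-feature shift, shape (num_features,)
    eps   : small constant for numerical stability
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0)                    # per-feature mean over the batch
    var = x.var(axis=0)                      # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)  # standardized activations
    return gamma * x_hat + beta              # learnable rescaling
```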