Platt scaling

From Wikipedia, the free encyclopedia

In machine learning, Platt scaling or Platt calibration is a method of transforming the outputs of a classification model into a probability distribution over classes. The method was invented by John Platt in the context of support vector machines,[1] replacing an earlier method by Vapnik, but can be applied to other classification models.[2] Platt scaling works by fitting a logistic regression model to a classifier's scores.

Description

Let f be a real-valued function that is used as a binary classifier, predicting for each example x a label y from the set {+1, -1} as y = sign(f(x)) (disregarding the possibility of a zero output for now). When a probability P(y=1|x) is required instead, but the model does not provide one (or produces poorly calibrated probability estimates), Platt scaling can be used. This method produces probabilities

P(y=1|x) = 1 / (1 + exp(A·f(x) + B)),

i.e., a logistic transformation of the classifier scores f(x).

The (scalar) parameters A and B are estimated using a maximum likelihood method. Platt himself suggested using the Levenberg–Marquardt algorithm to optimize the parameters, but a Newton algorithm was later proposed that should be more numerically stable.[3]
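The fitting procedure can be sketched in a few lines of numpy. This is a minimal illustration, not Platt's own implementation: it estimates A and B by plain gradient descent on the negative log-likelihood of the sigmoid P(y=1|x) = 1 / (1 + exp(A·f(x) + B)), whereas Platt suggested Levenberg–Marquardt and Lin et al. a Newton method. The function names (`fit_platt`, `platt_prob`) and the learning-rate/iteration settings are illustrative choices.

```python
import numpy as np

def fit_platt(scores, labels, n_iter=2000, lr=0.5):
    """Fit A, B in P(y=1|x) = 1 / (1 + exp(A*f(x) + B)) by gradient
    descent on the mean negative log-likelihood.

    scores -- raw classifier outputs f(x), shape (n,)
    labels -- binary targets in {0, 1}, shape (n,)
    """
    t = labels.astype(float)
    A, B = 0.0, 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(A * scores + B))
        # Gradients of the mean negative log-likelihood with
        # respect to A and B (the residual p - t drives both).
        gA = -np.mean((p - t) * scores)
        gB = -np.mean(p - t)
        A -= lr * gA
        B -= lr * gB
    return A, B

def platt_prob(score, A, B):
    """Calibrated probability P(y=1|x) for a raw score f(x)."""
    return 1.0 / (1.0 + np.exp(A * score + B))

# Synthetic demonstration: noisy scores, labels mostly agreeing
# with the sign of the score.
rng = np.random.default_rng(0)
scores = rng.normal(0.0, 1.0, 500)
labels = (scores + rng.normal(0.0, 0.5, 500) > 0).astype(int)
A, B = fit_platt(scores, labels)
```

Note that A comes out negative when higher scores indicate the positive class, since the positive score f(x) must push the exponent down to drive the probability toward 1. In practice the parameters are fit on a held-out calibration set rather than the classifier's own training data, to avoid biased probability estimates.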

References

  1. ^ Platt, John (1999). "Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods" (PDF). Advances in large margin classifiers. 10 (3): 61–74.
  2. ^ Niculescu-Mizil, Alexandru; Caruana, Rich (2005). Predicting good probabilities with supervised learning (PDF). ICML.
  3. ^ Lin, Hsuan-Tien; Lin, Chih-Jen; Weng, Ruby C. (2007). "A note on Platt's probabilistic outputs for support vector machines" (PDF). Machine Learning. 68 (3): 267–276.