In statistics, Fisher's scoring algorithm is a form of Newton's method used to solve maximum likelihood equations numerically.
Sketch of derivation
Let $Y_1,\ldots,Y_n$ be random variables, independent and identically distributed with twice differentiable p.d.f. $f(y;\theta)$, and we wish to calculate the maximum likelihood estimator (M.L.E.) $\theta^*$ of $\theta$. First, suppose we have a starting point for our algorithm $\theta_0$, and consider a Taylor expansion of the score function, $V(\theta)$, about $\theta_0$:
$$V(\theta) \approx V(\theta_0) - \mathcal{J}(\theta_0)(\theta - \theta_0),$$
where
$$\mathcal{J}(\theta_0) = -\sum_{i=1}^{n} \left. \nabla\nabla^{\top} \right|_{\theta=\theta_0} \log f(Y_i;\theta)$$
is the observed information matrix at $\theta_0$. Now, setting $\theta = \theta^*$, using that $V(\theta^*) = 0$ and rearranging gives us:
$$\theta^* \approx \theta_0 + \mathcal{J}^{-1}(\theta_0)\,V(\theta_0).$$
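As a quick worked instance (a standard textbook example, not part of the original derivation), suppose the $Y_i$ are exponential with density $f(y;\theta) = \theta e^{-\theta y}$. Then

$$V(\theta) = \sum_{i=1}^{n} \frac{\partial}{\partial\theta} \log f(Y_i;\theta) = \frac{n}{\theta} - \sum_{i=1}^{n} Y_i, \qquad \mathcal{J}(\theta) = \frac{n}{\theta^2},$$

so a single update from $\theta_0$ gives $\theta_0 + \frac{\theta_0^2}{n}\bigl(\frac{n}{\theta_0} - \sum_i Y_i\bigr) = 2\theta_0 - \theta_0^2\,\bar{Y}$, whose fixed point is the closed-form M.L.E. $\theta^* = 1/\bar{Y}$. Here $\mathcal{J}(\theta)$ happens not to depend on the data, so $\mathcal{J}(\theta) = \mathcal{I}(\theta)$ and Newton's method coincides with the Fisher scoring variant described below.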
We therefore use the algorithm
$$\theta_{m+1} = \theta_m + \mathcal{J}^{-1}(\theta_m)\,V(\theta_m),$$
and under certain regularity conditions, it can be shown that $\theta_m \rightarrow \theta^*$.
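In code, the iteration is immediate once the score and observed information are available. The following is a minimal sketch (the function names, the exponential example, and the starting value are illustrative assumptions, not from the article), written for a scalar parameter so that $\mathcal{J}^{-1}V$ reduces to a division:

import numpy as np

def newton_raphson_mle(theta0, score, obs_info, n_iter=50, tol=1e-12):
    # Iterate theta_{m+1} = theta_m + J(theta_m)^{-1} V(theta_m).
    theta = theta0
    for _ in range(n_iter):
        step = score(theta) / obs_info(theta)  # scalar J^{-1} V
        theta = theta + step
        if abs(step) < tol:  # stop once the update is negligible
            break
    return theta

# Exponential example from above: V(t) = n/t - sum(y), J(t) = n/t**2.
y = np.array([0.8, 1.3, 0.4, 2.1, 0.9])
n = len(y)
theta_hat = newton_raphson_mle(
    theta0=1.0,
    score=lambda t: n / t - y.sum(),
    obs_info=lambda t: n / t**2,
)
# theta_hat converges to the closed-form M.L.E. 1/mean(y) ≈ 0.9091.

For a vector parameter, the division is replaced by solving the linear system $\mathcal{J}(\theta_m)\,\delta = V(\theta_m)$ rather than forming the inverse explicitly.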
Fisher scoring
In practice, $\mathcal{J}(\theta)$ is usually replaced by $\mathcal{I}(\theta) = \mathrm{E}[\mathcal{J}(\theta)]$, the Fisher information, thus giving us the Fisher scoring algorithm:
$$\theta_{m+1} = \theta_m + \mathcal{I}^{-1}(\theta_m)\,V(\theta_m).$$
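As an illustration of where the expected information is convenient, consider logistic regression (again an illustrative example rather than part of the article; the variable names below are assumptions). For this model $\mathcal{I}(\beta) = X^{\top} W X$ with $W = \mathrm{diag}\{p_i(1-p_i)\}$, and, because the Hessian of the log-likelihood does not involve the responses $y$, Fisher scoring coincides with Newton's method and with iteratively reweighted least squares:

import numpy as np

def fisher_scoring_logistic(X, y, n_iter=25, tol=1e-10):
    # Fisher scoring for logistic regression:
    # beta_{m+1} = beta_m + I(beta_m)^{-1} V(beta_m).
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))  # fitted probabilities
        score = X.T @ (y - p)                # V(beta)
        w = p * (1.0 - p)                    # model variances of Y_i
        info = X.T @ (X * w[:, None])        # I(beta) = X^T W X
        step = np.linalg.solve(info, score)  # solve rather than invert
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

When $\mathcal{J}$ and $\mathcal{I}$ do differ, Fisher scoring trades some of Newton's local convergence speed for an update matrix that is always positive semi-definite, which tends to make the iteration more stable far from the optimum.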