Thresholding (image processing)

Thresholding is the simplest method of image segmentation. Individual pixels in a grayscale image are marked as 'object' pixels if their value is greater than some threshold value (assuming an object to be brighter than the background) and as 'background' pixels otherwise. Typically, an object pixel is given a value of '1' while a background pixel is given a value of '0'. The key parameter here is obviously the choice of the threshold. Several different methods for choosing a threshold exist. The simplest method would be to choose the mean or median value, the rationale being that if the object pixels are brighter than the background, they should also be brighter than the average. In a noiseless image with uniform background and object values, the mean or median will work beautifully as the threshold, however generally speaking, this will not be the case. A more sophisticated approach might be to create a histogram of the image pixel intensities and use the valley point as the threshold. The histogram approach assumes that there is some average value for the background and object pixels, but that the actual pixel values have some variation around these average values. However, computationally this is not as simple as we'd like, and many image histograms do not have clearly defined valley points. Ideally we're looking for a method for choosing the threshold which is simple, does not require too much prior knowledge of the photo, and works well for noisy images. Perhaps the best such approach is an iterative method, as follows:

An initial threshold (T) is chosen, this can be done randomly or according to any other method desired.
The image is segmented into object and background pixels as described above, creating two sets:
1. G1 = {f(m,n):f(m,n)>T} (object pixels)
2. G2 = {f(m,n):f(m,n)<=T} (background pixels) (note, f(m,n) is the value of the pixel located in the m column, n row)
The average of each set is computed.
1. m1 = average value of G1
2. m2 = average value of G2
A new threshold is created that is the average of m1 and m2
1. T' = (m1 + m2)/2
Go back to step two, now using the new threshold computed in step 4, keep repeating until the new threshold matches the one before it (i.e. until convergence has been reached).

This iterative algorithm is a special one-dimensional case of the k-means clustering algorithm, which has been proven to converge at a local minimum - meaning that a different initial threshold may result in a different final result.