Part-based models

Part-based models are a broad class of detection algorithms for images in which various parts of the image are examined separately to determine whether, and where, an object of interest is present. Among these methods, a very popular one is the constellation model, a term that refers broadly to schemes that detect a small number of features together with their relative positions and then use both to decide whether the object of interest is present. These models build on the original idea of Fischler and Elschlager of using the relative positions of a few template matches, and they grow in complexity in the work of Perona and others. They are covered in the constellation models section. An example helps to illustrate what is meant by a constellation model. Suppose we are trying to detect faces. A constellation model would use smaller part detectors, for instance mouth, nose and eye detectors, and would judge whether an image contains a face based on the relative positions at which these detectors fire.
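The face example can be made concrete with a minimal sketch. It assumes that each part detector has already reported the image position of its strongest response, and it scores how closely the layout matches an expected arrangement. The offsets in EXPECTED_OFFSETS, the parameter sigma, the threshold and the function names are all hypothetical values chosen for illustration; they are not taken from any published constellation model.

```python
import math

# Hypothetical expected offsets (dx, dy) of each part relative to the nose,
# measured in units of the distance between the eyes. Illustrative values only.
EXPECTED_OFFSETS = {
    "left_eye": (-0.5, -0.6),
    "right_eye": (0.5, -0.6),
    "mouth": (0.0, 0.5),
}


def constellation_score(detections, sigma=0.2):
    """Score in [0, 1]; higher means the part layout looks more face-like.

    `detections` maps a part name to the (x, y) image position where that
    part detector fired most strongly. Image y grows downward. The sketch
    normalises for scale but not for in-plane rotation.
    """
    required = set(EXPECTED_OFFSETS) | {"nose"}
    if not required <= set(detections):
        return 0.0  # a required part detector did not fire

    lx, ly = detections["left_eye"]
    rx, ry = detections["right_eye"]
    scale = math.hypot(rx - lx, ry - ly)  # inter-eye distance sets the scale
    if scale == 0:
        return 0.0

    nx, ny = detections["nose"]
    squared_error = 0.0
    for part, (ex, ey) in EXPECTED_OFFSETS.items():
        px, py = detections[part]
        dx = (px - nx) / scale
        dy = (py - ny) / scale
        squared_error += (dx - ex) ** 2 + (dy - ey) ** 2

    # Gaussian-style penalty on deviation from the expected layout.
    return math.exp(-squared_error / (2 * sigma ** 2))


def is_face(detections, threshold=0.5):
    """Declare a face when the layout score reaches the (assumed) threshold."""
    return constellation_score(detections) >= threshold
```

With detections such as eyes at (40, 40) and (60, 40), nose at (50, 52) and mouth at (50, 62), the layout matches the assumed offsets and the score is high; moving the mouth above the eyes drives the score toward zero even though every individual detector fired.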

Non-Constellation Models

Many overlapping ideas fall under the title of part-based models, even after excluding models of the constellation variety. The uniting thread is the use of small parts to build up an algorithm that can detect or recognize an item (a face, a car, etc.). Early efforts, such as those by Yuille, Hallinan and Cohen, sought to detect facial features and fit deformable templates to them. These templates were mathematically defined outlines that sought to capture the position and shape of the feature. Their algorithm had trouble finding the global minimum fit for a given model, however, so templates did occasionally become mismatched.
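The idea of fitting a parameterised outline by minimising an energy can be illustrated with a toy sketch. It is not the template or the optimiser used by Yuille, Hallinan and Cohen: the template here is a plain circle with parameters (cx, cy, r), the energy is the mean squared distance from a set of edge points to that circle, and the search is a simple greedy coordinate search. All names and values are assumptions made for illustration.

```python
import math


def energy(params, edge_points):
    """Mean squared gap between each edge point and the circle outline."""
    cx, cy, r = params
    total = 0.0
    for x, y in edge_points:
        gap = math.hypot(x - cx, y - cy) - r
        total += gap * gap
    return total / len(edge_points)


def fit_template(edge_points, init, step=1.0, min_step=1e-6):
    """Greedy local search: accept any single-parameter move that lowers the energy.

    The step is halved whenever no move helps. The search only ever goes
    downhill, so it returns a local minimum near `init`, which need not be
    the global one.
    """
    params = list(init)
    best = energy(params, edge_points)
    while step > min_step:
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[i] += delta
                trial_energy = energy(trial, edge_points)
                if trial_energy < best:
                    params, best = trial, trial_energy
                    improved = True
        if not improved:
            step /= 2
    return tuple(params), best


# Synthetic edge points on a circle centred at (5, 3) with radius 2.
EDGE_POINTS = [
    (5 + 2 * math.cos(2 * math.pi * k / 16), 3 + 2 * math.sin(2 * math.pi * k / 16))
    for k in range(16)
]

fitted_params, fitted_energy = fit_template(EDGE_POINTS, init=(4.0, 4.0, 1.0))
```

On clean synthetic data like this, the search recovers the circle from a nearby starting point. Because it only moves downhill, a cluttered edge map or a poor starting point can leave it in a worse local minimum, which is the kind of mismatch described above.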

Later efforts, such as those by Poggio and Brunelli, focus on building specific detectors for each feature. They use successive detectors to estimate scale, position, etc., and to narrow the search field used by the next detector. As such it is a part-based model; however, they seek more to recognize specific faces than to detect the presence of a face. They do so by using each detector to build a 35-element vector of characteristics of a given face. These characteristics can then be compared to recognize specific faces; however, cut-offs can also be used to detect whether a face is present at all.
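Comparing feature vectors with a cut-off can be sketched as a nearest-neighbour lookup. The sketch assumes that the 35 characteristics have already been measured and stored as plain lists of numbers. The Euclidean distance, the cut-off value and the names recognize and gallery are illustrative assumptions, not the classifier described by Poggio and Brunelli.

```python
import math

VECTOR_LENGTH = 35  # number of characteristics per face, as stated in the text


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(query, gallery, cutoff):
    """Return (name, distance) for the closest known face.

    `gallery` maps a person's name to a stored feature vector. If even the
    closest match is farther than `cutoff`, the name is None, meaning the
    query is treated as an unknown face or as not a face at all.
    """
    if len(query) != VECTOR_LENGTH:
        raise ValueError("expected a 35-element feature vector")
    best_name, best_distance = None, math.inf
    for name, vector in gallery.items():
        distance = euclidean(query, vector)
        if distance < best_distance:
            best_name, best_distance = name, distance
    if best_distance > cutoff:
        return None, best_distance
    return best_name, best_distance
```

For example, with a gallery holding two stored vectors, a query close to one of them returns that person's name, while a query far from both returns None. In practice the characteristics would likely need to be normalised so that no single measurement dominates the distance.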