Hough forests

Hough forests are a combination of random forests and the generalised Hough transform, and can be used in computer vision to detect and track objects. Using Hough forests to detect objects was first proposed by Juergen Gall and Victor Lempitsky in 2009.[1]

Using Hough forests for object recognition requires two main steps:

  • training the model by building numerous random trees from many labelled examples;
  • and passing patches of the image to be searched through each of the trees to find the area in which the object is most likely to be found.

While training the model, a boolean test function is associated with each of the internal nodes and a posterior probability with each of the leaves in the individual trees, so that a sample of an object to be classified can be passed through a tree from root to leaf to arrive at the probability that the object is present in the image.
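
As an illustration of this root-to-leaf classification, the following Python sketch shows how a single trained tree of this kind might be applied to a sample; the node layout and the test callable are illustrative assumptions, not details taken from the cited paper.

class Node:
    def __init__(self, test=None, left=None, right=None, posterior=None):
        self.test = test            # boolean function applied to a sample (internal nodes only)
        self.left = left            # child followed when the test returns False
        self.right = right          # child followed when the test returns True
        self.posterior = posterior  # probability stored at a leaf (None for internal nodes)

def classify(root, sample):
    """Pass a sample from the root to a leaf and return the leaf's posterior probability."""
    node = root
    while node.posterior is None:                           # internal node: keep descending
        node = node.right if node.test(sample) else node.left
    return node.posterior                                   # probability that the object is present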

Training the Model


Training the model requires a large set of labelled positive and negative examples, with each positive example surrounded by a D-dimensional bounding box that gives the location of the object within the training image. For object detection in single static images a dimension of 2 is enough, whereas for moving objects the bounding box also needs to take time into account and so requires a dimension of at least 3. The random trees required for the Hough forest method are built by randomly sampling smaller sub-images from the training images and applying a label to each of them. These labels take the form

    $(I_i, c_i, d_i)$

where:


  • $I_i$ represents the appearance of the sub-image. This is needed because the method must recognise objects that can look different from one image to another but are in fact the same object, for example the same object seen under different lighting to the example image.
  • $c_i$ represents the class the sub-image belongs to. For positive examples the class is assigned the value 1 and for negative examples the class is assigned the value 0.
  • $d_i$ represents the distance between the random sub-image and the centre of the training image.
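
A labelled sub-image of this kind could be represented in code roughly as follows; the field names are illustrative rather than taken from the source.

from dataclasses import dataclass
import numpy as np

@dataclass
class TrainingPatch:
    appearance: np.ndarray   # pixel values (and any extra feature channels) of the sub-image, I_i
    label: int               # class c_i: 1 for a positive example, 0 for a negative example
    offset: np.ndarray       # d_i: vector from the sub-image to the centre of the training image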

Once these labels have been created, the random trees are then generated using the following recursive algorithm:

Algorithm TreeBuilder(images, node):
   if at the maximum height or fewer than the minimum number of images remain:
       assign node a probability based on the proportion of positive and negative examples
       return node
   test ← a randomly selected binary test, assigned to node
   split images into passed and failed according to test
   left child of node ← TreeBuilder(failed, new node)
   right child of node ← TreeBuilder(passed, new node)
   return node
  • "←" denotes assignment. For instance, "largestitem" means that the value of largest changes to the value of item.
  • "return" terminates the algorithm and outputs the following value.

The key step in this algorithm is the assignment of the probability to the leaves of the tree. Examples that reach the same leaf must have passed exactly the same sequence of binary tests and can therefore be considered similar. The probability stored at a leaf is based on the proportion of positive to negative examples that follow the same path to reach that leaf.
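
The recursion above can be sketched in Python roughly as follows; the stopping thresholds and the way the random binary test is supplied are assumptions made for illustration rather than the exact procedure of the cited paper.

MAX_DEPTH = 15       # assumed maximum tree depth
MIN_PATCHES = 20     # assumed minimum number of patches needed to keep splitting

class Node:                                   # same layout as in the earlier traversal sketch
    def __init__(self, test=None, left=None, right=None, posterior=None):
        self.test, self.left, self.right, self.posterior = test, left, right, posterior

def build_tree(patches, make_random_test, depth=0):
    """Recursively build one random tree from labelled patches (cf. TreeBuilder above)."""
    # Stop and create a leaf when the tree is deep enough or too few patches remain.
    if depth >= MAX_DEPTH or len(patches) < MIN_PATCHES:
        positives = sum(p.label for p in patches)
        return Node(posterior=positives / len(patches) if patches else 0.0)
    # Otherwise draw a random binary test and split the patches with it.
    test = make_random_test()
    passed = [p for p in patches if test(p)]
    failed = [p for p in patches if not test(p)]
    return Node(test=test,
                left=build_tree(failed, make_random_test, depth + 1),
                right=build_tree(passed, make_random_test, depth + 1))

Here the patches are assumed to be labelled sub-images of the kind sketched earlier, and make_random_test is any function that draws a fresh random binary test.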

Binary Tests


Binary tests can take the form of anything that partitions the set of images into two subsets based on whether a particular image has the desired property or not. Examples of binary tests include the number of edges present, whether corners are present, the colours present, and whether particular optical flows are present in the image.

Juergen Gall and Victor Lempitsky in their paper[1] proposed comparing the pixel intensities at two points, (p, q) and (r, s), within a patch, using the test

    $t(I) = \begin{cases} 0 & \text{if } I(p,q) < I(r,s) + \tau \\ 1 & \text{otherwise} \end{cases}$

where $\tau$ is a randomly chosen threshold. This test partitions the set of training images into two subsets, grouping together images whose intensities at the two chosen points compare in the same way.
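
In code, a test of this form might be drawn at random as in the sketch below; treating the patch as a 2-D array of intensities and choosing the two positions and the threshold uniformly at random are assumptions made for illustration.

import random

def make_intensity_test(patch_height, patch_width, max_threshold=30):
    """Draw one random pixel-comparison test of the kind described above."""
    p, q = random.randrange(patch_height), random.randrange(patch_width)
    r, s = random.randrange(patch_height), random.randrange(patch_width)
    tau = random.uniform(0, max_threshold)
    def test(patch):
        # The test 'passes' when the intensity at (p, q) is at least the intensity at (r, s) plus tau.
        return patch.appearance[p, q] >= patch.appearance[r, s] + tau
    return test

A function such as this could be supplied to the earlier build_tree sketch, for example as build_tree(patches, lambda: make_intensity_test(16, 16)).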

Object Detection


Like the generalised Hough transform, the detection phase of the algorithm requires a Hough image to be calculated. This Hough image is created by accumulating the votes cast by the individual areas of the image, votes which model the probability that the object is located in a particular area.

The difference between the generalised Hough transform and the Hough forest method is in the calculation of the number of votes. First, the test image is split into individual patches of a fixed size. Then the probability that a patch contains part of an object of the class the system is searching for is written

    $p(E_x \mid I(y))$

This probability models the chance that the object being searched for is at location $x$ given the properties $I(y)$ of the test image at location $y$.

When a patch is passed through a single tree it arrives at a leaf, and the probability stored at that leaf determines the number of votes the patch casts into the Hough image. To get the final value for the votes a patch receives, the patch is run through each of the random trees in the Hough forest and the average of the per-tree results is taken.
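
A minimal sketch of this voting and averaging step is given below. It assumes that each leaf stores, alongside its probability, the list of offset vectors of the positive training patches that reached it; that storage scheme is an assumption made for the sketch rather than a detail stated above.

import numpy as np

def leaf_for_patch(tree, patch):
    """Descend one tree until a leaf (a node with no test) is reached."""
    node = tree
    while node.test is not None:
        node = node.right if node.test(patch) else node.left
    return node

def cast_votes(forest, patch, patch_centre, hough_image):
    """Add this patch's votes, averaged over all trees, into the Hough image."""
    for tree in forest:
        leaf = leaf_for_patch(tree, patch)
        offsets = getattr(leaf, "offsets", [])          # assumed per-leaf list of offset vectors
        if not offsets:
            continue
        weight = leaf.posterior / (len(offsets) * len(forest))
        for d in offsets:
            y, x = (np.asarray(patch_centre) - np.asarray(d)).astype(int)
            if 0 <= y < hough_image.shape[0] and 0 <= x < hough_image.shape[1]:
                hough_image[y, x] += weight             # one weighted vote for an object centre here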

To combine the votes for the individual patches into a single Hough image, the votes for each pixel are added together. To detect the areas in which the object lies, the local maxima of the Hough image are found and a bounding box is drawn around each area with the maximum number of votes.
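
Continuing the sketch, the combined Hough image can then be searched for its strongest peak; the smoothing step and the fixed bounding-box size below are illustrative assumptions.

import numpy as np
from scipy.ndimage import gaussian_filter

def detect(hough_image, box_height, box_width, sigma=3.0):
    """Return a bounding box centred on the strongest peak of the Hough image."""
    smoothed = gaussian_filter(hough_image, sigma)       # suppress isolated, noisy votes
    cy, cx = np.unravel_index(np.argmax(smoothed), smoothed.shape)
    top, left = cy - box_height // 2, cx - box_width // 2
    return top, left, top + box_height, left + box_width  # (top, left, bottom, right)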

Advantages of Using Hough Forests


There are many advantages to using random Hough forests for object detection and object tracking:

  • They are efficient to learn and apply.
  • They allow fine image detail to be exploited and hence achieve high accuracy.
  • They can also be used for online learning to learn and react as more information becomes available.

Applications of Hough Forests


Hough forests are fast and efficient and can be used for:

  • Object detection
  • Object tracking
  • Action recognition

References

  1. Gall, Juergen; Lempitsky, Victor (2009). Class-Specific Hough Forests for Object Detection (PDF). Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.