Capsule neural network

A capsule neural network is a kind of artificial neural network (ANN) inspired by the cortical minicolumns of the cerebral cortex. In Geoffrey Hinton's original idea, one minicolumn would represent and detect one multidimensional entity.^{[note 1]} A neuron in an ordinary ANN typically outputs a single activation that in some sense represents the probability of an observation, whereas a capsule outputs both a probability for the observation and a generalized pose for it. This pose can include properties such as position, orientation, and scale. If the entity is not present, the pose properties are disregarded.

The important difference is that a capsule in a higher layer takes the vectors output by capsules in lower layers and looks for tight clusters among them. When it finds a cluster, the higher capsule outputs a high probability that an entity is present at that location, together with a generalized pose for the probable entity. The generalized pose can have 20 to 50 dimensions. Because a capsule detects a multidimensional entity, a tight cluster of such capsule outputs is very unlikely to arise by chance, which makes agreement between capsules much stronger evidence than the activation of ordinary artificial neurons.

Even for a cluster of only two capsules describing a six-dimensional entity, if they agree to within 10% in every dimension there is roughly a one-in-a-million chance (0.1^6, treating the dimensions as independent) that the agreement is a coincidence. A larger cluster, or one with more dimensions, would be correspondingly less likely to arise by chance.
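The arithmetic behind this estimate can be sketched in a few lines of Python. This is only a back-of-the-envelope model, not something taken from Hinton's work: it assumes every pose dimension is independent and uniformly distributed over a unit range, so that another capsule lands within the tolerance of the first with probability equal to the tolerance. The function name `chance_agreement` and its parameters are hypothetical.

```python
import math


def chance_agreement(dims, tolerance, n_capsules=2):
    """Rough probability that n_capsules poses agree by pure chance.

    Assumption (not from the source): each dimension is independent and
    uniform, so each extra capsule matches the first one in a single
    dimension with probability `tolerance`.
    """
    if dims < 1 or n_capsules < 2 or not 0.0 < tolerance <= 1.0:
        raise ValueError("need dims >= 1, n_capsules >= 2, 0 < tolerance <= 1")
    # Every capsule after the first must match in every dimension.
    return tolerance ** (dims * (n_capsules - 1))


# Two capsules, six dimensions, 10% tolerance: about one in a million.
p_two = chance_agreement(dims=6, tolerance=0.1)

# A third agreeing capsule makes a coincidence far less likely still.
p_three = chance_agreement(dims=6, tolerance=0.1, n_capsules=3)

print(p_two, p_three)
```

Under these assumptions the probability shrinks exponentially with both the number of dimensions and the number of agreeing capsules, which is the point of the paragraph above.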

It is important that the higher capsule neglects outliers; what matters is the small subsets of lower-level capsule outputs that form tight clusters. This is similar to the Hough transform, the randomized Hough transform (RHT), and RANSAC from classical image processing.
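The idea of keeping a tight cluster and ignoring outliers can be illustrated with a simple consensus search in the spirit of RANSAC. The sketch below is not the routing procedure of any published capsule network; the function `find_tight_cluster`, the `radius` and `min_size` parameters, and the use of the inlier fraction as a stand-in for the output probability are all assumptions made for illustration.

```python
import math


def find_tight_cluster(votes, radius, min_size=2):
    """Return (inlier_fraction, mean_pose) for the densest cluster of votes.

    `votes` is a list of equal-length pose vectors from lower capsules.
    Returns (0.0, None) if no cluster of at least `min_size` votes exists.
    """
    best = []
    for center in votes:
        # Votes within `radius` of this candidate count as agreeing with it.
        inliers = [v for v in votes if math.dist(center, v) <= radius]
        if len(inliers) > len(best):
            best = inliers
    if len(best) < min_size:
        return 0.0, None
    dims = len(best[0])
    # The pose is averaged over the inliers only, so outliers have no effect.
    mean_pose = [sum(v[d] for v in best) / len(best) for d in range(dims)]
    return len(best) / len(votes), mean_pose


# Three votes agree closely; the fourth is an outlier and is ignored.
votes = [[1.0, 2.0], [1.1, 2.0], [0.9, 2.0], [5.0, -3.0]]
fraction, pose = find_tight_cluster(votes, radius=0.3)
print(fraction, pose)
```

The outlying vote lowers the inlier fraction but does not shift the estimated pose, which is the behaviour the paragraph above describes.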

Notes

^ In Hinton's own words this is wild speculation, but it is quite a good speculation about what the minicolumns are doing.
