Wake-sleep algorithm

From Wikipedia, the free encyclopedia
Layers of the wake-sleep algorithm. R, G are weights between the layers.

The wake-sleep algorithm is an unsupervised learning algorithm for a multilayer neural network (e.g. a sigmoid belief network). It is one of the suggested ways of training the Helmholtz machine[1][2]. The algorithm consists of two learning phases, "wake" and "sleep", which are performed alternately[3]. It was first designed as a model of brain function based on variational Bayesian learning, and was later adapted to machine learning.

Description

The wake-sleep algorithm is visualized as a stack of layers containing representations of data[4]. Each layer represents the data in the layer below it; the actual data sit below the bottom layer, so the layers above become progressively more abstract. Between each pair of adjacent layers there is a recognition weight and a generative weight, which are trained to improve reliability as the algorithm runs[5].
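This layered structure can be illustrated with a minimal sketch. The layer sizes and the initialisation scheme below are arbitrary assumptions for illustration, not part of the algorithm itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative layer sizes: the data layer at the bottom,
# progressively more abstract layers above it.
sizes = [6, 4, 2]  # [data, hidden, top]

# Between each pair of adjacent layers there is one recognition weight
# matrix R (bottom-up) and one generative weight matrix G (top-down).
R = [rng.normal(0, 0.1, (sizes[i + 1], sizes[i])) for i in range(len(sizes) - 1)]
G = [rng.normal(0, 0.1, (sizes[i], sizes[i + 1])) for i in range(len(sizes) - 1)]
```

The recognition matrices map activity upward (toward abstraction) and the generative matrices map it back down (toward the data); both sets are what the two training phases adjust.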

The wake-sleep algorithm is convergent[6].

Training

Training consists of two phases – “wake” and “sleep”.

The "wake" phase

Neurons are fired by recognition connections (leading from what would be the input to what would be the output). The generative connections (leading from the outputs back to the inputs) are then modified to increase the probability that they would recreate the correct activity in the layer below – closer to the actual data from sensory input[7].

The "sleep" phase

The process is reversed in the "sleep" phase: neurons are fired by generative connections, while the recognition connections are modified to increase the probability that they would recreate the correct activity in the layer above – further from the actual data from sensory input.
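The two alternating phases can be sketched as follows. This is only an illustrative sketch under stated assumptions – stochastic binary units, a simple delta-rule update, and arbitrary layer sizes and learning rate – and not a definitive implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
sizes = [6, 4, 2]  # illustrative: data layer, hidden layer, top layer
# Recognition weights R (bottom-up) and generative weights G (top-down).
R = [rng.normal(0, 0.1, (sizes[i + 1], sizes[i])) for i in range(2)]
G = [rng.normal(0, 0.1, (sizes[i], sizes[i + 1])) for i in range(2)]
lr = 0.05  # arbitrary learning rate

def wake_phase(x):
    # Bottom-up pass: recognition connections fire each layer from the one below.
    states = [x]
    for Ri in R:
        p = sigmoid(Ri @ states[-1])
        states.append((rng.random(p.shape) < p).astype(float))
    # Train the generative weights to recreate each layer from the layer above.
    for i in range(len(G)):
        pred = sigmoid(G[i] @ states[i + 1])
        G[i] += lr * np.outer(states[i] - pred, states[i + 1])

def sleep_phase():
    # Top-down "dream": generative connections fire each layer from the one above.
    states = [(rng.random(sizes[-1]) < 0.5).astype(float)]
    for Gi in reversed(G):
        p = sigmoid(Gi @ states[0])
        states.insert(0, (rng.random(p.shape) < p).astype(float))
    # Train the recognition weights to recreate each layer from the layer below.
    for i in range(len(R)):
        pred = sigmoid(R[i] @ states[i])
        R[i] += lr * np.outer(states[i + 1] - pred, states[i])

x = (rng.random(sizes[0]) < 0.5).astype(float)  # stand-in for one data sample
wake_phase(x)
sleep_phase()
```

Note how the sleep phase mirrors the wake phase: the roles of the recognition and generative connections are swapped, and the "data" are fantasies generated from the top layer rather than real sensory input.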

Potential risks

Since variational Bayesian learning is based on probabilities, there is a chance that the approximation is performed with errors, which can damage subsequent data representations. Another downside pertains to complicated or corrupted data samples, which make it difficult to infer a representational pattern.

It has been suggested that the layers of the inference network are not powerful enough to recover a good estimator of the posterior distribution of latent variables[8].

References

  1. ^ Dayan, Peter; Hinton, Geoffrey E. (1996-11-01). "Varieties of Helmholtz Machine". Neural Networks. Four Major Hypotheses in Neuroscience. 9 (8): 1385–1403. doi:10.1016/S0893-6080(96)00009-3.
  2. ^ Dayan, Peter; Hinton, Geoffrey E; Neal, Radford M; Zemel, Richard S (1994-08-24). "The Helmholtz Machine" (PDF). Retrieved 2015-11-01.
  3. ^ Katayama, Katsuki; Ando, Masataka; Horiguchi, Tsuyoshi (2004-04-01). "Models of MT and MST areas using wake–sleep algorithm". Neural Networks. 17 (3): 339–351. doi:10.1016/j.neunet.2003.07.004.
  4. ^ Maei, Hamid Reza (2007-01-25). "Wake-sleep algorithm for representational learning". University of Montreal. Retrieved 2011-11-01.
  5. ^ Neal, Radford M.; Dayan, Peter (1996-11-24). "Factor Analysis Using Delta Rules Wake-Sleep Learning" (PDF). University of Toronto. Retrieved 2015-11-01.
  6. ^ Ikeda, Shiro; Amari, Shun-ichi; Nakahara, Hiroyuki. "Convergence of The Wake-Sleep Algorithm" (PDF). The Institute of Statistical Mathematics. Retrieved 2015-11-01.
  7. ^ Hinton, Geoffrey; Dayan, Peter; Frey, Brendan J; Neal, Radford M (1995-04-03). "The wake-sleep algorithm for unsupervised neural networks" (PDF). Retrieved 2015-11-01.
  8. ^ Bornschein, Jörg; Bengio, Yoshua (2014-06-10). "Reweighted Wake-Sleep". arXiv:1406.2751 [cs].