End-to-end reinforcement learning

From Wikipedia, the free encyclopedia

In end-to-end reinforcement learning, the end-to-end process, that is, the entire processing pipeline from sensors (raw sensor signals or pixels) to motors (actions or motions) of a robot or agent, is represented by a single layered or recurrent neural network and is trained by reinforcement learning.[1] The approach became widely known through the learning of Atari games[2][3] and AlphaGo[4]. A minimal illustrative sketch of the idea is given below.
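The following sketch (not taken from the cited sources) illustrates the end-to-end principle: one neural network maps raw pixel input directly to a value for each action, and the same network is updated by a simple one-step Q-learning rule. The toy environment, network sizes, and hyperparameters are illustrative assumptions and do not reproduce the setups of the referenced papers.

# Minimal end-to-end RL sketch: pixels -> network -> action values,
# trained by one-step Q-learning. Everything here is illustrative.
import numpy as np

rng = np.random.default_rng(0)

N_PIXELS = 16 * 16      # flattened raw "sensor" input (assumed toy size)
N_ACTIONS = 4           # "motor" outputs
HIDDEN = 32

# One hidden-layer network: pixels -> hidden -> Q-value per action.
W1 = rng.normal(0, 0.1, (N_PIXELS, HIDDEN))
W2 = rng.normal(0, 0.1, (HIDDEN, N_ACTIONS))

def q_values(pixels):
    h = np.maximum(0.0, pixels @ W1)   # ReLU hidden layer
    return h, h @ W2                   # Q-value for each action

def toy_step(action):
    # Toy stand-in for a real environment: random next frame,
    # reward 1 only for action 0 (purely for demonstration).
    next_pixels = rng.random(N_PIXELS)
    reward = 1.0 if action == 0 else 0.0
    return next_pixels, reward

gamma, lr, eps = 0.99, 1e-3, 0.1
pixels = rng.random(N_PIXELS)

for step in range(2000):
    h, q = q_values(pixels)
    # Epsilon-greedy action selection computed directly from raw pixels.
    action = rng.integers(N_ACTIONS) if rng.random() < eps else int(np.argmax(q))
    next_pixels, reward = toy_step(action)

    # One-step Q-learning target and TD error for the chosen action.
    _, q_next = q_values(next_pixels)
    target = reward + gamma * np.max(q_next)
    td_error = target - q[action]

    # Manual gradient of 0.5 * td_error^2 w.r.t. both weight matrices.
    dq = np.zeros(N_ACTIONS)
    dq[action] = -td_error
    dh = (W2 @ dq) * (h > 0)
    W2 -= lr * np.outer(h, dq)
    W1 -= lr * np.outer(pixels, dh)

    pixels = next_pixels

In practice, the works cited below use deep convolutional (or recurrent) networks and additional stabilization techniques such as experience replay and target networks; the sketch only shows the single-network, sensors-to-actions structure that the term "end-to-end" refers to.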

References

  1. ^ Demis Hassabis (2016). Artificial Intelligence and the Future. MIT Press. Online.
  2. ^ V. Mnih et al. (2013). Playing Atari with deep reinforcement learning. Online.
  3. ^ V. Mnih et al. (2015). Human-level control through deep reinforcement learning. Nature 518, 529–533.
  4. ^ D. Silver et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489.