End-to-end reinforcement learning

From Wikipedia, the free encyclopedia

In end-to-end reinforcement learning, the end-to-end process, that is, the entire processing pipeline from sensors (raw sensor signals or pixels) to motors (actions or motions) of a robot or agent, is represented by a single layered or recurrent neural network and is trained by reinforcement learning.[1] The approach became widely known through the learning of Atari games[2][3] and AlphaGo[4]. A minimal illustrative sketch of the idea is given below.
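The following sketch (not taken from the cited sources) illustrates the end-to-end principle: one neural network maps raw pixel input directly to a value for each action, and the same network is updated by a simple one-step Q-learning rule. The toy environment, network sizes, and hyperparameters are illustrative assumptions and do not reproduce the setups of the referenced papers.

# Minimal end-to-end RL sketch: pixels -> network -> action values,
# trained by one-step Q-learning. Everything here is illustrative.
import numpy as np

rng = np.random.default_rng(0)

N_PIXELS = 16 * 16      # flattened raw "sensor" input (assumed toy size)
N_ACTIONS = 4           # "motor" outputs
HIDDEN = 32

# One hidden-layer network: pixels -> hidden -> Q-value per action.
W1 = rng.normal(0, 0.1, (N_PIXELS, HIDDEN))
W2 = rng.normal(0, 0.1, (HIDDEN, N_ACTIONS))

def q_values(pixels):
    h = np.maximum(0.0, pixels @ W1)   # ReLU hidden layer
    return h, h @ W2                   # Q-value for each action

def toy_step(action):
    # Toy stand-in for a real environment: random next frame,
    # reward 1 only for action 0 (purely for demonstration).
    next_pixels = rng.random(N_PIXELS)
    reward = 1.0 if action == 0 else 0.0
    return next_pixels, reward

gamma, lr, eps = 0.99, 1e-3, 0.1
pixels = rng.random(N_PIXELS)

for step in range(2000):
    h, q = q_values(pixels)
    # Epsilon-greedy action selection computed directly from raw pixels.
    action = rng.integers(N_ACTIONS) if rng.random() < eps else int(np.argmax(q))
    next_pixels, reward = toy_step(action)

    # One-step Q-learning target and TD error for the chosen action.
    _, q_next = q_values(next_pixels)
    target = reward + gamma * np.max(q_next)
    td_error = target - q[action]

    # Manual gradient of 0.5 * td_error^2 w.r.t. both weight matrices.
    dq = np.zeros(N_ACTIONS)
    dq[action] = -td_error
    dh = (W2 @ dq) * (h > 0)
    W2 -= lr * np.outer(h, dq)
    W1 -= lr * np.outer(pixels, dh)

    pixels = next_pixels

In practice, the works cited below use deep convolutional (or recurrent) networks and additional stabilization techniques such as experience replay and target networks; the sketch only shows the single-network, sensors-to-actions structure that the term "end-to-end" refers to.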

References

  1. ^ Demis Hassabis (2016). Artificial Intelligence and the Future. MIT Press. Online.
  2. ^ V. Mnih et al. (2013). Playing Atari with deep reinforcement learning. Online.
  3. ^ V. Mnih et al. (2015). Human-level control through deep reinforcement learning. Nature 518, 529–533.
  4. ^ D. Silver et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489.