Markov decision process
A Markov decision process (MDP) is a discrete-time stochastic control process defined by a set of states, a set of actions, transition probabilities that depend on the action chosen in each state, and a reward associated with each transition. MDPs model sequential decision problems in which outcomes are partly random and partly under the control of a decision maker, and they underlie a wide range of optimization problems solved by dynamic programming and reinforcement learning.
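As an illustration of the definition above, the following is a minimal sketch of a two-state, two-action MDP solved by value iteration, a standard dynamic programming method. Every number in it is hypothetical, and the reward table `R`, the discount factor `GAMMA`, and the function names `q_value` and `value_iteration` are assumptions introduced only for this example.

```python
# Hypothetical two-state, two-action MDP; all numbers are illustrative only.
# P[a][s][t] is the probability of moving from state s to state t under action a.
P = [
    [[0.9, 0.1], [0.4, 0.6]],  # action 0
    [[0.2, 0.8], [0.1, 0.9]],  # action 1
]
# R[a][s] is the expected immediate reward for taking action a in state s (assumed).
R = [
    [1.0, 0.0],  # action 0
    [0.0, 2.0],  # action 1
]
GAMMA = 0.9  # discount factor (assumed)


def q_value(P, R, gamma, V, s, a):
    """Expected return of taking action a in state s, then following values V."""
    return R[a][s] + gamma * sum(p * v for p, v in zip(P[a][s], V))


def value_iteration(P, R, gamma, tol=1e-8, max_iter=10_000):
    """Iterate the Bellman optimality update until the values stop changing."""
    n_actions = len(P)
    n_states = len(P[0])
    V = [0.0] * n_states
    for _ in range(max_iter):
        new_V = [
            max(q_value(P, R, gamma, V, s, a) for a in range(n_actions))
            for s in range(n_states)
        ]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < tol:
            break
    # Greedy policy with respect to the final value estimates.
    policy = [
        max(range(n_actions), key=lambda a: q_value(P, R, gamma, V, s, a))
        for s in range(n_states)
    ]
    return V, policy


V, policy = value_iteration(P, R, GAMMA)
print("values:", V)
print("policy:", policy)
```

Each row of `P[a]` sums to 1, which is what makes it a transition probability matrix for action `a`. The returned `policy` lists, for each state, the index of the action that maximizes the expected discounted reward under the estimated values.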
External links
- MDP Toolbox for Matlab - An excellent tutorial and Matlab toolbox for working with MDPs.
References
- Bellman, R. E. Dynamic Programming. Princeton University Press, Princeton, NJ, 1957.
- Puterman, M. L. Markov Decision Processes. Wiley, 1994.