Sequential decision making

From Wikipedia, the free encyclopedia

Sequential decision making is a concept in control theory and operations research that involves making a series of decisions over time to optimize an objective function, such as maximizing cumulative rewards or minimizing costs. In this framework, each decision influences subsequent choices and system outcomes, taking into account the current state, the available actions, and the probabilistic nature of state transitions.[1] The framework is used to model and regulate dynamic systems, especially under uncertainty, and is commonly addressed with methods such as Markov decision processes (MDPs) and dynamic programming.[2]
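A minimal sketch of these ideas, assuming a hypothetical three-state MDP (the states, actions, transition probabilities, and rewards below are invented for illustration): value iteration repeatedly applies the Bellman optimality update until the state values converge, and a greedy policy is then read off from those values.

```python
# Hypothetical 3-state MDP for illustration only.
# P[s][a] is a list of (probability, next_state, reward) transitions.
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(0.9, 2, 2.0), (0.1, 1, 0.0)]},
    2: {"stay": [(1.0, 2, 0.0)]},  # absorbing state
}
gamma = 0.9  # discount factor

def value_iteration(P, gamma, tol=1e-8):
    """Return the optimal state values V*(s) by iterating the
    Bellman optimality update until the change falls below tol."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in trans)
                for trans in P[s].values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

V = value_iteration(P, gamma)
# Greedy policy: in each state, pick the action whose expected
# discounted return under V is largest.
policy = {
    s: max(P[s], key=lambda a: sum(p * (r + gamma * V[s2])
                                   for p, s2, r in P[s][a]))
    for s in P
}
```

Here each decision ("stay" or "go") changes which states become reachable later, which is the sequential aspect: the optimal action in state 0 depends on the value of state 1, which in turn depends on the reward obtainable in state 2.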

References

  1. ^ Puterman, Martin L. (1994). Markov decision processes: discrete stochastic dynamic programming. Wiley series in probability and mathematical statistics. Applied probability and statistics section. New York: Wiley. pp. 1–2. ISBN 978-0-471-61977-2.
  2. ^ Bellman, Richard (1958-09-01). "Dynamic programming and stochastic control processes". Information and Control. 1 (3): 228–239. doi:10.1016/S0019-9958(58)80003-0. ISSN 0019-9958.