Sequential decision making

From Wikipedia, the free encyclopedia

Sequential decision making is a concept in control theory and operations research that involves making a series of decisions over time to optimize an objective function, such as maximizing cumulative rewards or minimizing costs. In this framework, each decision influences subsequent choices and system outcomes, taking into account the current state, the available actions, and the probabilistic nature of state transitions.[1] The framework is used to model and regulate dynamic systems, especially under uncertainty, and is commonly addressed with methods such as Markov decision processes (MDPs) and dynamic programming.[2]
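A minimal sketch of these ideas, assuming a hypothetical three-state MDP (the states, actions, transition probabilities, and rewards below are invented for illustration): value iteration repeatedly applies the Bellman optimality update until the state values converge, and a greedy policy is then read off from those values.

```python
# Hypothetical 3-state MDP for illustration only.
# P[s][a] is a list of (probability, next_state, reward) transitions.
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(0.9, 2, 2.0), (0.1, 1, 0.0)]},
    2: {"stay": [(1.0, 2, 0.0)]},  # absorbing state
}
gamma = 0.9  # discount factor

def value_iteration(P, gamma, tol=1e-8):
    """Return the optimal state values V*(s) by iterating the
    Bellman optimality update until the change falls below tol."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in trans)
                for trans in P[s].values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

V = value_iteration(P, gamma)
# Greedy policy: in each state, pick the action whose expected
# discounted return under V is largest.
policy = {
    s: max(P[s], key=lambda a: sum(p * (r + gamma * V[s2])
                                   for p, s2, r in P[s][a]))
    for s in P
}
```

Here each decision ("stay" or "go") changes which states become reachable later, which is the sequential aspect: the optimal action in state 0 depends on the value of state 1, which in turn depends on the reward obtainable in state 2.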

References

  1. ^ Puterman, Martin L. (1994). Markov decision processes: discrete stochastic dynamic programming. Wiley series in probability and mathematical statistics. Applied probability and statistics section. New York: Wiley. pp. 1–2. ISBN 978-0-471-61977-2.
  2. ^ Bellman, Richard (1958-09-01). "Dynamic programming and stochastic control processes". Information and Control. 1 (3): 228–239. doi:10.1016/S0019-9958(58)80003-0. ISSN 0019-9958.