Markov decision process (MDP), also called a stochastic dynamic program or stochastic control problem, is a model for sequential decision making when...
34 KB (5,133 words) - 19:34, 11 February 2025
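To make the MDP formalism in this entry concrete, here is a minimal sketch of an MDP as a (states, actions, transitions, rewards) tuple with a sampling step. All state names, probabilities, and rewards below are illustrative assumptions, not taken from any source article.

```python
import random

# A toy two-state MDP. P[(s, a)] maps to a list of (next_state, probability)
# pairs; R[(s, a)] is the expected immediate reward for taking a in s.
states = ["s0", "s1"]
actions = ["stay", "move"]

P = {
    ("s0", "stay"): [("s0", 0.9), ("s1", 0.1)],
    ("s0", "move"): [("s1", 0.8), ("s0", 0.2)],
    ("s1", "stay"): [("s1", 1.0)],
    ("s1", "move"): [("s0", 1.0)],
}
R = {("s0", "stay"): 0.0, ("s0", "move"): 1.0,
     ("s1", "stay"): 2.0, ("s1", "move"): 0.0}

def step(state, action, rng=random):
    """Sample a successor state from P and return (next_state, reward)."""
    outcomes = P[(state, action)]
    r = rng.random()
    cumulative = 0.0
    for next_state, prob in outcomes:
        cumulative += prob
        if r < cumulative:
            return next_state, R[(state, action)]
    return outcomes[-1][0], R[(state, action)]
```

Deterministic entries behave predictably: `step("s1", "stay")` always returns `("s1", 2.0)`.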
observable Markov decision process (POMDP) is a generalization of a Markov decision process (MDP). A POMDP models an agent decision process in which it...
22 KB (3,307 words) - 17:50, 19 February 2025
In probability theory and statistics, a Markov chain or Markov process is a stochastic process describing a sequence of possible events in which the probability...
96 KB (12,886 words) - 00:59, 3 March 2025
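The defining feature of a Markov chain described in this entry, that the next state depends only on the current one, can be sketched in a few lines. The two-state "weather" chain and its transition probabilities are invented for illustration.

```python
import random

# Next-state distributions: each row depends only on the current state
# (the Markov property), never on earlier history.
TRANSITIONS = {
    "sunny": [("sunny", 0.8), ("rainy", 0.2)],
    "rainy": [("sunny", 0.5), ("rainy", 0.5)],
}

def next_state(state, rng):
    """Sample the successor state from the current state's distribution."""
    r = rng.random()
    cumulative = 0.0
    for s, p in TRANSITIONS[state]:
        cumulative += p
        if r < cumulative:
            return s
    return TRANSITIONS[state][-1][0]

def simulate(start, steps, seed=0):
    """Generate a sample path of the chain, reproducible via the seed."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        path.append(next_state(path[-1], rng))
    return path
```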
to expected rewards. A partially observable Markov decision process (POMDP) is a Markov decision process in which the state of the system is only partially...
10 KB (1,197 words) - 11:49, 30 December 2024
probability theory and statistics, the term Markov property refers to the memoryless property of a stochastic process, which means that its future evolution...
8 KB (1,124 words) - 20:04, 6 February 2025
The decentralized partially observable Markov decision process (Dec-POMDP) is a model for coordination and decision-making among multiple agents. It is a...
3 KB (501 words) - 23:27, 25 June 2024
Markov decision process Markov's inequality Markov brothers' inequality Markov information source Markov network Markov number Markov property Markov process...
10 KB (1,072 words) - 15:39, 28 November 2024
Reinforcement learning (category Markov models)
dilemma. The environment is typically stated in the form of a Markov decision process (MDP), as many reinforcement learning algorithms use dynamic programming...
64 KB (7,487 words) - 10:10, 9 February 2025
Gauss–Markov theorem Gauss–Markov process Markov blanket Markov boundary Markov chain Markov chain central limit theorem Additive Markov chain Markov additive...
2 KB (229 words) - 07:10, 17 June 2024
this choice by trying both directions over time. For any finite Markov decision process, Q-learning finds an optimal policy in the sense of maximizing...
29 KB (3,835 words) - 20:22, 28 February 2025
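The tabular Q-learning backup underlying the claim above (convergence to an optimal policy on any finite MDP, under standard conditions) is compact enough to sketch. The state/action names and learning-rate values are illustrative assumptions.

```python
from collections import defaultdict

def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning backup:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    The max over next actions makes the rule off-policy."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]

Q = defaultdict(float)  # unseen (state, action) pairs default to 0.0
q_learning_update(Q, "s0", "right", 1.0, "s1", ["left", "right"])
```

Starting from an all-zero table, a reward of 1.0 moves `Q[("s0", "right")]` to `alpha * 1.0 = 0.1`.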
theory, a Markov reward model or Markov reward process is a stochastic process which extends either a Markov chain or continuous-time Markov chain by adding...
3 KB (275 words) - 03:33, 13 March 2024
example of a one-pass algorithm is the Sondik partially observable Markov decision process. Given any list as an input: Count the number of elements. Given...
3 KB (415 words) - 14:59, 12 December 2023
using methods like Markov decision processes (MDPs) and dynamic programming. Puterman, Martin L. (1994). Markov decision processes: discrete stochastic dynamic...
1 KB (152 words) - 18:14, 13 December 2024
Multi-armed bandit (redirect from Bandit process)
played. The bandit problem is formally equivalent to a one-state Markov decision process. The regret ρ {\displaystyle \rho } after T {\displaystyle T} rounds...
67 KB (7,669 words) - 02:39, 12 February 2025
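The regret quantity mentioned in this entry has a short expected-value form: after T rounds, it is T times the best arm's mean minus the accumulated means of the arms actually pulled. The arm means below are hypothetical.

```python
def expected_regret(mu, pulls):
    """Expected regret rho after T = len(pulls) rounds.
    mu:    list of (hypothetical) true arm means
    pulls: list of the arm indices chosen at each round"""
    best = max(mu)
    return len(pulls) * best - sum(mu[i] for i in pulls)
```

For arm means [0.2, 0.5] and the pull sequence [0, 1, 1, 0], the two suboptimal pulls each cost 0.3, so `expected_regret` returns 0.6.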
Machine learning (section Decision trees)
reinforcement learning, the environment is typically represented as a Markov decision process (MDP). Many reinforcement learning algorithms use dynamic programming...
135 KB (14,996 words) - 18:19, 12 February 2025
State–action–reward–state–action (SARSA) is an algorithm for learning a Markov decision process policy, used in the reinforcement learning area of machine learning...
6 KB (716 words) - 19:17, 6 December 2024
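The update that gives SARSA its name (state, action, reward, next state, next action) can be sketched as one line of tabular arithmetic; names and hyperparameters here are illustrative.

```python
from collections import defaultdict

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """One SARSA backup. Unlike Q-learning's max over next actions,
    SARSA bootstraps on the action a_next the policy actually takes,
    which makes it on-policy:
    Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])
    return Q[(s, a)]

Q = defaultdict(float)
sarsa_update(Q, "s0", "right", 1.0, "s1", "left")
```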
using decision theory, decision analysis, and information value theory. These tools include models such as Markov decision processes, dynamic decision networks...
275 KB (28,289 words) - 00:31, 2 March 2025
framework of Markov decision processes with incomplete information, which ultimately led to the notion of a partially observable Markov decision process. In 1995...
6 KB (551 words) - 16:59, 16 November 2024
University. He is noted for his work on Markov decision processes, the Gittins index, multi-armed bandits, Markov chains, and other related fields. Katehakis...
10 KB (966 words) - 01:50, 18 January 2025
reinforcement learning if the environment is stochastic and a Markov decision process (MDP) is used. Research in learning automata can be traced back...
6 KB (763 words) - 19:33, 15 May 2024
probability distribution (and the reward function) associated with the Markov decision process (MDP), which, in RL, represents the problem to be solved. The transition...
6 KB (614 words) - 16:21, 27 January 2025
Monte Carlo tree search (category Optimal decisions)
Adaptive Multi-stage Sampling (AMS) algorithm for the model of Markov decision processes. AMS was the first work to explore the idea of UCB-based exploration...
38 KB (4,606 words) - 07:40, 30 December 2024
Bellman equation (section A dynamic decision problem)
optimality condition in optimal control theory Markov decision process – Mathematical model for sequential decision making under uncertainty Optimal control...
27 KB (4,005 words) - 16:37, 13 August 2024
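The Bellman optimality condition referenced in this entry can be solved on a small tabular MDP by iterating the backup until it converges, a procedure known as value iteration. This is a sketch under assumed conventions: `P` maps `(s, a)` to `(next_state, probability)` pairs, and the one-state example is invented.

```python
def value_iteration(P, R, states, actions, gamma=0.9, tol=1e-8):
    """Iterate the Bellman optimality backup
    V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) * V(s') ]
    until the largest per-sweep change falls below tol."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            v_new = max(
                R[(s, a)] + gamma * sum(p * V[s2] for s2, p in P[(s, a)])
                for a in actions
            )
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            return V

# One self-looping state paying reward 1 each step: the fixed point is
# V = 1 / (1 - gamma) = 10.
V = value_iteration({("s", "a"): [("s", 1.0)]}, {("s", "a"): 1.0}, ["s"], ["a"])
```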
Stochastic game (redirect from Markov game)
the stage payoffs. Stochastic games generalize Markov decision processes to multiple interacting decision makers, as well as strategic-form games to dynamic...
16 KB (2,438 words) - 13:00, 17 February 2025
time are represented by probability distributions describing a Markov decision process and the cycle of perception and action treated as an information...
17 KB (1,910 words) - 17:46, 10 February 2025
allows efficient description of Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) by representing everything...
30 KB (3,469 words) - 00:04, 7 January 2025
which is, for the example, E(3) = 3.5 slots. Control theory Markov chain Markov decision process Tanenbaum & Wetherall 2010, p. 395 Rosenberg et al. RFC3261...
23 KB (3,340 words) - 16:29, 25 December 2024
multi-agent influence diagrams, in particular dependency graphs, and Markov decision processes to solve multiscale challenges in sociotechnical systems. MSDT...
5 KB (497 words) - 08:04, 18 August 2023
For the Markov decision process, the main idea is to transform the optimization into an infinite horizon average reward Markov decision process. For a...
18 KB (3,068 words) - 03:59, 20 January 2025
American scholar of computer science whose research has focused on Markov decision processes, queuing theory, computer networks, peer-to-peer networks, Internet...
5 KB (401 words) - 07:30, 13 September 2024