Why introduce Markov property to reinforcement learning?

As a beginner in deep reinforcement learning, I am confused about why we should assume a Markov process in reinforcement learning, and what benefits it brings. In addition, a Markov process requires that, once the "present" is known, the "future" has nothing to do with the "past". Why, then, can some deep reinforcement learning algorithms use RNNs and LSTMs? Does this violate the Markov assumption?

Upvotes: 2

Answers (2)

Alexandre Krul

Reputation: 55

The Markov assumption says that the current state contains all the information about the past agent-environment interaction that makes a difference to the future of the system. It matters because it lets you define the dynamics of the process as p(s', r | s, a). In practical terms, you don't need to look at all the previous states of the system to determine the possible next states.
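Written out, the property can be sketched as follows. The time-indexed symbols S_t, A_t and R_t are an assumption here, borrowed from the usual textbook notation; the answer above only uses p(s', r | s, a).

$$
\Pr\{S_{t+1}=s',\, R_{t+1}=r \mid S_t=s,\, A_t=a\}
= \Pr\{S_{t+1}=s',\, R_{t+1}=r \mid S_0, A_0, R_1, \dots, S_{t-1}, A_{t-1}, R_t,\, S_t=s,\, A_t=a\}
$$

The left-hand side is exactly p(s', r | s, a): conditioning on the full history adds nothing once the current state and action are known.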

Upvotes: 0

Federico Malerba

Reputation: 815

The Markov property is what makes the math work out in the optimization process. Keep in mind, however, that it is much more generally applicable than you might think. For example, if in a certain board game you need to know the last three states of the game, this might seem to violate the Markov property; however, if you simply redefine your "state" to be the concatenation of the last three states, you are back in an MDP.
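A minimal sketch of that redefinition, assuming raw observations are hashable values and that a history of three is enough. The class name `HistoryState` and its methods are hypothetical, made up for illustration, and not taken from any library.

```python
from collections import deque


class HistoryState:
    """Hypothetical helper: turns the last k raw observations into one state."""

    def __init__(self, k, initial_obs):
        # Pad with the initial observation so the state always has length k.
        self.frames = deque([initial_obs] * k, maxlen=k)

    def push(self, obs):
        # Appending to a full deque drops the oldest observation automatically.
        self.frames.append(obs)
        return self.state()

    def state(self):
        # The concatenated history is what the agent treats as "the state".
        return tuple(self.frames)


history = HistoryState(k=3, initial_obs="s0")
history.push("s1")
history.push("s2")
augmented_state = history.push("s3")  # the last three raw observations
```

An RNN or LSTM can be read in the same spirit: its hidden state is a learned summary of the history, standing in for the hand-built concatenation above, so using one is a way to recover an approximately Markov state rather than a violation of the assumption.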

Upvotes: 5
