Dirk

Reputation: 608

Can state in Proximal Policy Optimization contain history?

For example, can the state at timestep t actually be made up of the observations at t and t-1?

S_t = [s_t, s_{t-1}]

i.e., does Proximal Policy Optimization already incorporate the state history, or can it be made implicit in the state (or neither)?

Upvotes: 1

Views: 264

Answers (1)

BadProgrammer

Reputation: 371

You could concatenate your observations. This is very common to do in RL. In the Atari domain, the last four frames are usually stacked into a single observation. This makes it possible for the agent to perceive change in the environment (e.g., the velocity of a moving ball).
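A minimal sketch of that stacking idea (the class name `ObservationStacker` is hypothetical; in practice you would typically use an environment wrapper such as Gym's `FrameStack` rather than rolling your own):

```python
from collections import deque

import numpy as np


class ObservationStacker:
    """Concatenates the last k observations into one flat array.

    Illustrative sketch only: each call returns S_t = [s_{t-k+1}, ..., s_t],
    oldest observation first.
    """

    def __init__(self, k):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, obs):
        # At episode start there is no history yet, so a common
        # convention is to fill the buffer with copies of the first obs.
        for _ in range(self.k):
            self.frames.append(obs)
        return np.concatenate(self.frames)

    def step(self, obs):
        # deque(maxlen=k) drops the oldest frame automatically.
        self.frames.append(obs)
        return np.concatenate(self.frames)


stacker = ObservationStacker(k=2)
s0 = stacker.reset(np.array([1.0]))   # [1.0, 1.0]
s1 = stacker.step(np.array([2.0]))    # [1.0, 2.0] -> S_t = [s_{t-1}, s_t]
```

The stacked array is what you would feed to the PPO policy as its "state"; the algorithm itself is unchanged.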

A basic PPO algorithm does not implicitly keep track of state history by default. You could make this possible, though, by adding a recurrent layer (e.g., an LSTM) to the policy network, so that a hidden state carries information across timesteps.
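To illustrate what "adding a recurrent layer" means, here is a toy NumPy sketch of a recurrent policy head (all names and sizes are made up for illustration; a real implementation would use an LSTM/GRU from a deep learning framework and train it with PPO's clipped objective):

```python
import numpy as np


class RecurrentPolicySketch:
    """Toy recurrent policy head: h_t = tanh(W_in @ obs + W_h @ h_{t-1}).

    Because the hidden state h persists between calls, the action logits
    can depend on the whole observation history, not just the current obs.
    """

    def __init__(self, obs_dim, hidden_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.normal(scale=0.1, size=(hidden_dim, obs_dim))
        self.W_h = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
        self.W_out = rng.normal(scale=0.1, size=(n_actions, hidden_dim))
        self.h = np.zeros(hidden_dim)

    def reset(self):
        # Clear the history at the start of each episode.
        self.h = np.zeros_like(self.h)

    def act(self, obs):
        # Update the hidden state, then produce action logits from it.
        self.h = np.tanh(self.W_in @ obs + self.W_h @ self.h)
        return self.W_out @ self.h
```

Note that feeding the same observation twice yields different logits, because the hidden state has changed in between; that is exactly the history-dependence a feedforward PPO policy lacks.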

Upvotes: 1

Related Questions