I'm trying to understand how rewards work in DeepMind's implementation of Atari Breakout, and I'm a little confused. Each state is represented by four frames, so, as I understand it, the reward for an action is only received after those four frames. My question is: what if the ball gets stuck somewhere it keeps collecting a lot of reward? How can we determine that the rewarded action is actually the cause of that extra reward?
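To make my understanding concrete, here is a minimal sketch of how I picture one agent step. It is not DeepMind's code: `ToyEnv`, `agent_step`, and the reward sequence are hypothetical names and values I made up for illustration, and I'm assuming the chosen action is simply repeated for four frames while the per-frame rewards are summed.

```python
from collections import deque


class ToyEnv:
    """Hypothetical stand-in for the emulator: emits a scripted reward per frame."""

    def __init__(self, rewards):
        self.rewards = list(rewards)
        self.t = 0

    def step(self, action):
        reward = self.rewards[self.t]
        frame = self.t  # the frame index stands in for the screen image
        self.t += 1
        done = self.t >= len(self.rewards)
        return frame, reward, done


def agent_step(env, action, frames, skip=4):
    """Repeat `action` for up to `skip` frames, summing rewards and updating the stack."""
    total = 0.0
    done = False
    for _ in range(skip):
        frame, reward, done = env.step(action)
        frames.append(frame)  # deque(maxlen=4) keeps only the newest four frames
        total += reward
        if done:
            break
    return tuple(frames), total, done


# Made-up reward sequence: the second half mimics a "stuck ball" scoring every frame.
env = ToyEnv([0, 1, 0, 1, 1, 1, 1, 1])
frames = deque([0, 0, 0, 0], maxlen=4)

state1, r1, done1 = agent_step(env, action=0, frames=frames)
state2, r2, done2 = agent_step(env, action=0, frames=frames)
```

In this sketch the second step collects more reward than the first even though the same action was taken both times, which is exactly the situation I don't know how to interpret.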
The case I'm talking about: