MrHolal

Reputation: 339

DQN unstable predictions

I implemented DQN from scratch in Java; everything is custom-made. I built it to play Snake and the results are really good, but I have a problem.

To make the network as stable as possible, I'm using replay memory and also a target network. The network converges really well, but after some time it just breaks.
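To make it concrete, here is a simplified sketch of the update step I'm describing; `Network` and `Transition` are hypothetical stand-ins for my custom classes, not the actual code:

    import java.util.List;

    // Hypothetical stand-ins for the custom classes -- not the real implementation.
    interface Network {
        double[] predict(double[] state);            // Q-values, one per action
        void train(double[] state, double[] target); // one supervised step towards the targets
        Network copy();                               // deep copy of the weights
    }

    class Transition {
        double[] state, nextState;
        int action;
        double reward;
        boolean terminal;
    }

    class DqnStep {
        static void trainOnBatch(Network policy, Network target,
                                 List<Transition> batch, double gamma) {
            for (Transition t : batch) {
                // Bellman target: r + gamma * max_a' Q_target(s', a'),
                // computed with the *target* network for stability
                double maxNextQ = Double.NEGATIVE_INFINITY;
                for (double q : target.predict(t.nextState)) {
                    maxNextQ = Math.max(maxNextQ, q);
                }

                // start from the policy network's current predictions and
                // overwrite only the Q-value of the action that was taken
                double[] targets = policy.predict(t.state);
                targets[t.action] = t.terminal
                        ? t.reward
                        : t.reward + gamma * maxNextQ;

                policy.train(t.state, targets);
            }
        }
    }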

Here is a graph of the results (X axis: games played, Y axis: average points scored):

[graph image]

This 'break' usually happens a few games after I copy the policy network into the target network.

Settings I use for DQN (sketched in code after the list):

 discount factor: 0.9
 learning rate: 0.001
 steps to update target network: 300,000 (i.e. every 300k steps I copy the policy network into the target network)
 replay memory size: 300,000
 replay memory batch size: 256 (every step I sample 256 transitions from replay memory and train the network on them)
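In code, the update schedule looks roughly like this (same hypothetical `Network` stand-in as in the sketch above):

    class DqnConfig {
        static final double DISCOUNT_FACTOR     = 0.9;
        static final double LEARNING_RATE       = 0.001;
        static final int    TARGET_UPDATE_STEPS = 300_000;
        static final int    REPLAY_MEMORY_SIZE  = 300_000;
        static final int    BATCH_SIZE          = 256;
    }

    class TargetUpdateSchedule {
        Network policy;
        Network target;
        long steps = 0;

        // called once per environment step, after training on a batch
        void maybeUpdateTarget() {
            steps++;
            if (steps % DqnConfig.TARGET_UPDATE_STEPS == 0) {
                // hard update: the target network becomes a copy of the policy network
                target = policy.copy();
            }
        }
    }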

Any ideas what could be wrong? Thanks for any answers.

Upvotes: 0

Views: 524

Answers (1)

Thomas Dixon

Reputation: 1

Look up "catastrophic forgetting".

Try adjusting your replay-memory size and the number of steps between target-network updates.
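One common way to make that update less abrupt is a soft ("Polyak") target update: rather than copying the whole policy network every N steps, blend its weights into the target network a little on every step. A rough sketch, assuming the weights can be read and written as a flat double[]:

    // Soft (Polyak) target update: instead of overwriting the target network
    // every N steps, nudge its weights towards the policy weights each step.
    class SoftTargetUpdate {
        static void update(double[] targetWeights, double[] policyWeights, double tau) {
            // tau is small, e.g. 0.001; tau = 1.0 would be a full hard copy
            for (int i = 0; i < targetWeights.length; i++) {
                targetWeights[i] = tau * policyWeights[i] + (1.0 - tau) * targetWeights[i];
            }
        }
    }

With a small tau the target network drifts slowly towards the policy network, which often avoids the sharp jump in behaviour you can see right after a hard copy.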

Upvotes: 0
