Reputation: 1

Q-value keeps stepping down when training a DQN

I am training a DQN and the Q-value keeps going down. The curve looks very weird (see below).

[Image: Q-value curve during training]

Each step in the curve corresponds to an update of the target network. What could cause this?

Upvotes: 0

Reputation: 153

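Does each step correspond to a Target Q network update? If so, try to:

1) update the TargetQ network less frequently

2) increase the discount factor (e.g. to 0.99 if you were using 0.5)

3) use a smooth update for the TargetQ network of the form (1 - tau) * old + tau * v1, where old is presumably the current TargetQ weights, v1 the online Q network weights, and tau a small value between 0 and 1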
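The smooth update in item 3 can be sketched roughly as follows. This is only an illustration in plain Python, not the code of any particular DQN library: the function name `soft_update`, the default `tau=0.005`, and the use of flat lists of floats in place of real network weights are all assumptions made for the example.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Return new target weights as (1 - tau) * old + tau * new.

    target_params: current target-network weights (hypothetical flat list of floats)
    online_params: current online-network weights (same length)
    tau: blend factor in [0, 1]; tau=1 is a hard copy, tau=0 leaves the target unchanged
    """
    if len(target_params) != len(online_params):
        raise ValueError("parameter lists must have the same length")
    return [(1.0 - tau) * old + tau * new
            for old, new in zip(target_params, online_params)]


# Toy usage: the target drifts toward the online weights a little on every call,
# instead of jumping all at once as it does with a periodic hard copy.
target = [0.0, 1.0]
online = [1.0, 3.0]
for _ in range(3):
    target = soft_update(target, online, tau=0.1)
```

Called after every training step with a small tau, this replaces the periodic hard copy, so the target values change gradually rather than in visible steps.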

Upvotes: 1