user19826638

Reputation: 61

Learning of DQN with noise data

I'm running some experiments with DQN on a simple navigation task with a binary reward at the end of the episode. DQN works perfectly well there. Now I'm thinking of perturbing the reward, so that 10% of the time the binary reward is inverted. Will this significantly disturb the TD updates? Is there any theoretical basis for whether DQN can survive this kind of adversarial attack on the training data?
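Concretely, the perturbation I have in mind is something like the sketch below (assuming a Gymnasium-style environment with a binary 0/1 terminal reward; the wrapper name and `flip_prob` are just illustrative):

```python
import gymnasium as gym
import numpy as np


class RewardFlipWrapper(gym.Wrapper):
    """Invert a binary {0, 1} reward with probability flip_prob."""

    def __init__(self, env, flip_prob=0.1, seed=None):
        super().__init__(env)
        self.flip_prob = flip_prob
        self.rng = np.random.default_rng(seed)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        # with probability flip_prob, flip the binary reward (0 <-> 1)
        if self.rng.random() < self.flip_prob:
            reward = 1.0 - reward
        return obs, reward, terminated, truncated, info
```

The DQN agent would then be trained on `RewardFlipWrapper(env, flip_prob=0.1)` instead of the clean environment.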

Upvotes: 0

Views: 74

Answers (1)

gehirndienst

Reputation: 463

I don't see much sense in it. A reward signal is a form of behavior control: you usually want to stabilize your agent's behavior so that it matches your expectations, not destabilize it. What you probably want instead is to introduce noise elsewhere to add internal stochasticity to the process. For that there is already a classic paper introducing noisy DQNs (NoisyNets); it is also one of the components of the Rainbow algorithm.
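For illustration, the core building block of that approach is a linear layer whose weights get learnable factorised Gaussian noise. A rough PyTorch sketch (the class name, `sigma0`, and the initialisation constants are my assumptions, not code from the paper) could look like this:

```python
import math
import torch
import torch.nn as nn


class NoisyLinear(nn.Module):
    """Linear layer with learnable factorised Gaussian noise (NoisyNet-style sketch)."""

    def __init__(self, in_features, out_features, sigma0=0.5):
        super().__init__()
        # learnable means and noise scales for weights and biases
        self.weight_mu = nn.Parameter(torch.empty(out_features, in_features))
        self.weight_sigma = nn.Parameter(torch.empty(out_features, in_features))
        self.bias_mu = nn.Parameter(torch.empty(out_features))
        self.bias_sigma = nn.Parameter(torch.empty(out_features))
        # noise buffers, resampled on every training forward pass
        self.register_buffer("eps_in", torch.zeros(in_features))
        self.register_buffer("eps_out", torch.zeros(out_features))
        bound = 1.0 / math.sqrt(in_features)
        nn.init.uniform_(self.weight_mu, -bound, bound)
        nn.init.uniform_(self.bias_mu, -bound, bound)
        nn.init.constant_(self.weight_sigma, sigma0 * bound)
        nn.init.constant_(self.bias_sigma, sigma0 * bound)

    @staticmethod
    def _f(x):
        # factorised-noise transform f(x) = sign(x) * sqrt(|x|)
        return x.sign() * x.abs().sqrt()

    def forward(self, x):
        if self.training:
            self.eps_in.normal_()
            self.eps_out.normal_()
            w_eps = torch.outer(self._f(self.eps_out), self._f(self.eps_in))
            weight = self.weight_mu + self.weight_sigma * w_eps
            bias = self.bias_mu + self.bias_sigma * self._f(self.eps_out)
        else:
            # at evaluation time use the mean parameters only
            weight, bias = self.weight_mu, self.bias_mu
        return nn.functional.linear(x, weight, bias)
```

Replacing the fully connected layers of the Q-network with such layers lets the agent learn how much exploration noise to inject, instead of corrupting the reward signal itself.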

Upvotes: 0
