Reputation: 25
It has been proved that the Q-learning algorithm converges to the Q-values of the optimal policy, which are unique. So is it correct to conclude that Q-learning cannot be overtrained?
Upvotes: 1
Views: 432
Reputation: 66815
There is no concept of overtraining in a setting where you assume unlimited access to the full data (which the convergence result for Q-learning assumes). If you do not use "pure" Q-learning, which is tabular and keeps one value per state-action pair, but instead some function approximator, as in Deep Q-learning, it can overtrain heavily. The absence of overtraining follows from unrealistic assumptions, which are usually not met (unless your problem is extremely simple/small).
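To make the tabular case concrete, here is a minimal sketch under stated assumptions. The three-state MDP, the names `TRANSITIONS`, `value_iteration`, `q_learning`, and `max_gap`, and the uniform sampling of state-action pairs (standing in for "unlimited access to the data") are all made up for illustration. The constant step size is only safe here because the toy environment is deterministic; a stochastic one would need decaying step sizes.

```python
import random

GAMMA = 0.9

# Hypothetical toy MDP: TRANSITIONS[state][action] = (next_state, reward)
TRANSITIONS = {
    0: {0: (0, 0.0), 1: (1, 0.0)},
    1: {0: (0, 0.0), 1: (2, 1.0)},
    2: {0: (2, 0.0), 1: (2, 0.0)},  # absorbing state, no further reward
}
PAIRS = [(s, a) for s in TRANSITIONS for a in TRANSITIONS[s]]


def value_iteration(sweeps=1000):
    """Compute the optimal Q-values directly from the known model."""
    q = {p: 0.0 for p in PAIRS}
    for _ in range(sweeps):
        for s, a in PAIRS:
            nxt, r = TRANSITIONS[s][a]
            q[(s, a)] = r + GAMMA * max(q[(nxt, b)] for b in TRANSITIONS[nxt])
    return q


def q_learning(steps, alpha=0.5, seed=0):
    """Tabular Q-learning; every state-action pair keeps being sampled."""
    rng = random.Random(seed)
    q = {p: 0.0 for p in PAIRS}
    for _ in range(steps):
        s, a = rng.choice(PAIRS)  # assumption: uniform access to all pairs
        nxt, r = TRANSITIONS[s][a]
        target = r + GAMMA * max(q[(nxt, b)] for b in TRANSITIONS[nxt])
        q[(s, a)] += alpha * (target - q[(s, a)])
    return q


def max_gap(q1, q2):
    """Largest absolute difference between two Q-tables."""
    return max(abs(q1[p] - q2[p]) for p in PAIRS)


if __name__ == "__main__":
    q_star = value_iteration()
    for steps in (5_000, 50_000):
        print(steps, max_gap(q_learning(steps), q_star))
```

In this sketch, training ten times longer does not move the table away from the optimal Q-values, because there is no finite training sample to memorise. With a function approximator trained on a limited replay buffer, that argument no longer applies.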
Upvotes: 3