alfa_80

Reputation: 427

Ways to utilize policy learned in reinforcement learning

I would like to cross-check my understanding of reinforcement learning. How easy, difficult, or common is it to train a policy and then reuse the learned policy later on? My understanding so far is that if we stop training and later start again, we would need to start from scratch, i.e. we would not be able to benefit from the previously learned policy. Thank you.

Upvotes: 2

Views: 135

Answers (1)

R.F. Nelson

Reputation: 2312

It depends on which method you are using, but in general, once a learning method has converged there is no need to keep training: the learned policy can simply be reused. Take Q-learning, for example, which is a model-free, off-policy approach to learning. Before the algorithm converges, the agent must still take some random actions to ensure that every relevant point in the Q(s,a) space has been explored. But each individual step takes advantage of the experience gained in prior episodes, so it would be incorrect to say that you start from scratch each episode.
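To make this concrete, here is a minimal sketch of tabular Q-learning where the learned table is written to disk and read back later. Everything in it is an assumption for illustration: the five-state chain environment, the helper names (`step`, `greedy`, `train`, `save_q`, `load_q`), the hyperparameters, and the choice of JSON as the storage format. It is not tied to any particular library.

```python
import json
import os
import random
import tempfile

N_STATES = 5      # hypothetical toy chain: states 0..4, state 4 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment: reward 1.0 only when the rightmost state is reached."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy(q, state, rng=None):
    """Best action under q; ties are broken randomly if an rng is given."""
    best = max(q[state])
    candidates = [a for a in ACTIONS if q[state][a] == best]
    return rng.choice(candidates) if rng else candidates[0]


def train(q, episodes, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Epsilon-greedy Q-learning. Updates q in place, so it can resume."""
    rng = random.Random(seed)
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)      # explore
            else:
                action = greedy(q, state, rng)    # exploit what was learned
            next_state, reward, done = step(state, action)
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def save_q(q, path):
    with open(path, "w") as f:
        json.dump(q, f)


def load_q(path):
    with open(path) as f:
        return json.load(f)


# Session 1: train from scratch, then store the result.
q = [[0.0, 0.0] for _ in ACTIONS for _ in range(1)] and [[0.0, 0.0] for _ in range(N_STATES)]
train(q, episodes=200)
path = os.path.join(tempfile.mkdtemp(), "q_table.json")
save_q(q, path)

# Session 2 (possibly much later): reuse the policy without any training.
restored = load_q(path)
policy = [greedy(restored, s) for s in range(N_STATES - 1)]
print(policy)

# Optional: keep learning, starting from the saved values instead of zeros.
resumed = train(load_q(path), episodes=50, seed=1)
```

The same idea carries over to function approximators: instead of a table you would save the network weights, and either act greedily with them or pass them back in as the starting point for more training.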

Upvotes: 2
