Saad Touhbi

Reputation: 345

QLearning usage on a repetitive simulation

I am using the Q-learning algorithm on a simulation. The simulation has a limited number of iterations (600 to 700), and the learning process is run over several runs of this simulation (100 runs). I am new to reinforcement learning, and I have a question about how to handle exploration/exploitation in this kind of setup (I am using epsilon-greedy exploration with a decreasing epsilon). Should I decrease epsilon across the whole set of simulation runs, or decrease it within each run (i.e., reset epsilon to 0.9 at the start of every run and then decrease it)? Thank you.
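To make the two options concrete, here is a minimal sketch (all names and the linear decay shape are my own illustration, not from the question) contrasting one epsilon schedule over all runs with a schedule that resets at the start of each run:

```python
# Hypothetical sketch of the two epsilon-decay schedules being compared:
# (a) one decay spread across all 100 runs, vs.
# (b) epsilon reset to 0.9 at the start of every run.
NUM_RUNS = 100        # simulation runs mentioned in the question
STEPS_PER_RUN = 600   # iterations per run (600-700 in the question)

def global_decay(step, total_steps, eps_start=0.9, eps_end=0.05):
    """Epsilon decays linearly over the whole training, all runs combined."""
    frac = min(step / total_steps, 1.0)
    return eps_start + (eps_end - eps_start) * frac

def per_run_decay(step_in_run, eps_start=0.9, eps_end=0.05):
    """Epsilon restarts at eps_start at the beginning of every run."""
    frac = min(step_in_run / STEPS_PER_RUN, 1.0)
    return eps_start + (eps_end - eps_start) * frac
```

With the global schedule the agent explores heavily only in the earliest runs; with the per-run schedule every run begins exploratory and ends mostly greedy.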

Upvotes: 1

Views: 105

Answers (1)

Tjorriemorrie

Reputation: 17282

You won’t need such a high initial epsilon. It might be better to initialize the Q-values to a very high (optimistic) value, so that unexplored state-action pairs are always picked over ones that have been explored at least once.
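A minimal sketch of that optimistic-initialization idea (the initial value of 10.0 and the helper names are my own assumptions; the init should be an upper bound on the returns in your simulation):

```python
from collections import defaultdict

OPTIMISTIC_INIT = 10.0  # assumed upper bound on true returns (problem-specific)

# Q-table where every unseen (state, action) pair defaults to a high value,
# so plain greedy selection naturally tries unvisited actions first.
Q = defaultdict(lambda: OPTIMISTIC_INIT)

def greedy_action(state, actions):
    # Unvisited actions still carry the optimistic default and win early on.
    return max(actions, key=lambda a: Q[(state, a)])
```

Once an action has been tried and its Q-value updated toward a realistic estimate, the remaining optimistic entries pull the agent toward whatever it has not explored yet, which is why a large epsilon becomes less necessary.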

Considering your state space, it doesn’t matter much whether you decrease epsilon over the whole set of runs or within each individual run, but per-run decay sounds like the better option.

How fast you decrease it will also depend on the dynamics of the environment and how fast the agent learns. I’m trying to make my alpha and epsilon correlate with the error, but it’s tricky to get right.
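One way the "epsilon correlated to error" idea could be sketched (entirely my own illustration of the concept, not the answerer's actual implementation): track a running average of the absolute TD error and let epsilon shrink as predictions get accurate.

```python
class ErrorDrivenEpsilon:
    """Hypothetical sketch: tie epsilon to a smoothed |TD error|,
    so exploration decreases as the Q-value estimates stabilize."""

    def __init__(self, eps_min=0.05, eps_max=0.9, smoothing=0.99, scale=1.0):
        self.eps_min = eps_min
        self.eps_max = eps_max
        self.smoothing = smoothing
        self.scale = scale
        self.avg_abs_error = 1.0  # start pessimistic -> high exploration

    def update(self, td_error):
        # Exponential moving average of the absolute TD error.
        self.avg_abs_error = (self.smoothing * self.avg_abs_error
                              + (1 - self.smoothing) * abs(td_error))

    @property
    def epsilon(self):
        # Clamp the error-driven value into [eps_min, eps_max].
        return max(self.eps_min, min(self.eps_max, self.scale * self.avg_abs_error))
```

The tricky part the answer alludes to is choosing the smoothing and scale so epsilon neither collapses before the agent has learned anything nor stays high forever on a noisy reward signal.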

Upvotes: 1
