Reputation: 67
I wrote the Q-learning algorithm in C++ with an epsilon-greedy policy, and now I have to plot the learning curve for the Q-values. What exactly should I plot? I have an 11x5 Q matrix, so should I pick a single Q-value and plot how it changes during learning, or should I use the whole matrix for the learning curve? Could you guide me on this? Thank you.
Upvotes: 0
Views: 671
Reputation: 11
For learning curves in Q-learning, it's common to plot the cumulative reward per episode. Typically, you accumulate the reward at each time step within an episode and print or plot the total once the episode has finished.
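A minimal sketch of that bookkeeping in C++, in case it helps. Only the 11x5 shape of the Q matrix comes from the question; the toy environment (`step`, a line of states with the goal at the last one), the reward values, and the hyperparameters `alpha`, `gamma`, `epsilon`, and `max_steps` are assumptions made up for illustration, so swap in your own environment and settings.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <random>
#include <vector>

constexpr int kStates = 11;  // rows of the Q matrix (from the question)
constexpr int kActions = 5;  // columns of the Q matrix (from the question)

struct StepResult { int next_state; double reward; bool done; };

// Hypothetical environment: action 0 moves left, action 1 moves right,
// every other action stays put. Reward is -1 per step, +10 on reaching the goal.
StepResult step(int state, int action) {
    int next = state;
    if (action == 0) next = std::max(0, state - 1);
    if (action == 1) next = std::min(kStates - 1, state + 1);
    const bool done = (next == kStates - 1);
    return {next, done ? 10.0 : -1.0, done};
}

// Runs tabular Q-learning and returns one number per episode:
// the sum of rewards collected in that episode (the learning curve).
std::vector<double> train(int episodes, unsigned seed = 42) {
    assert(episodes > 0);
    const double alpha = 0.1, gamma = 0.95, epsilon = 0.1;  // assumed values
    const int max_steps = 200;                              // assumed episode cap
    std::array<std::array<double, kActions>, kStates> Q{};  // zero-initialised
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    std::uniform_int_distribution<int> rand_action(0, kActions - 1);

    std::vector<double> returns;
    returns.reserve(episodes);
    for (int ep = 0; ep < episodes; ++ep) {
        int s = 0;
        double episode_return = 0.0;  // reset at the start of every episode
        for (int t = 0; t < max_steps; ++t) {
            // Epsilon-greedy action selection.
            int a;
            if (unif(rng) < epsilon) {
                a = rand_action(rng);
            } else {
                a = static_cast<int>(
                    std::max_element(Q[s].begin(), Q[s].end()) - Q[s].begin());
            }
            const StepResult r = step(s, a);
            const double best_next = r.done ? 0.0
                : *std::max_element(Q[r.next_state].begin(), Q[r.next_state].end());
            Q[s][a] += alpha * (r.reward + gamma * best_next - Q[s][a]);
            episode_return += r.reward;  // accumulate reward at each time step
            s = r.next_state;
            if (r.done) break;
        }
        returns.push_back(episode_return);  // one point of the learning curve
    }
    return returns;
}
```

Calling `train(500)` gives a vector with 500 entries; plotting entry `i` against episode index `i` is the learning curve. The Q matrix itself is never plotted, it only drives the action selection.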
Upvotes: 0
Reputation: 66815
Learning curves in RL are typically plots of returns over time, not of Q-losses or anything like that. So you should run your environment, compute the total reward of each episode (a.k.a. the return), and plot it against the corresponding time, e.g. the episode index.
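Since C++ has no built-in plotting, one option is to dump the per-episode returns to a CSV file and plot that with whatever tool you like. The sketch below assumes you already have the returns in a `std::vector<double>`. The trailing moving average is an optional extra that is not part of the answer above (raw returns under an epsilon-greedy policy tend to be noisy), and the names `moving_average`, `write_csv`, and the column headers are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ostream>
#include <vector>

// Trailing moving average: entry i is the mean of the last `window` returns
// up to and including episode i (fewer at the start, where less data exists).
std::vector<double> moving_average(const std::vector<double>& xs, std::size_t window) {
    assert(window > 0);
    std::vector<double> out;
    out.reserve(xs.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
        sum += xs[i];
        if (i >= window) sum -= xs[i - window];  // drop the value leaving the window
        const std::size_t n = std::min(i + 1, window);
        out.push_back(sum / static_cast<double>(n));
    }
    return out;
}

// Writes one row per episode: index, raw return, smoothed return.
void write_csv(std::ostream& os, const std::vector<double>& returns,
               const std::vector<double>& smoothed) {
    assert(returns.size() == smoothed.size());
    os << "episode,return,smoothed\n";
    for (std::size_t i = 0; i < returns.size(); ++i) {
        os << i << ',' << returns[i] << ',' << smoothed[i] << '\n';
    }
}
```

For example, `write_csv(file, returns, moving_average(returns, 20))` with `file` being a `std::ofstream` produces a table whose first column is the x-axis and whose other two columns are the curves. The window size of 20 is an arbitrary choice.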
Upvotes: 0