Nifty

Reputation: 67

Learning Curve in Q-learning

I wrote the Q-learning algorithm in C++ with an epsilon-greedy policy, and now I have to plot the learning curve for the Q-values. What exactly should I plot? I have an 11x5 Q matrix, so should I take a single Q-value and plot how it changes, or should I use the whole matrix for the learning curve? Could you guide me? Thank you.

Upvotes: 0

Views: 671

Answers (2)

Zahra Safari-d

Reputation: 11

For learning curves in Q-learning, it's common to plot the cumulative reward per episode. Typically, you accumulate the reward at each time step within an episode and then print or plot the total once the episode is complete.
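As a rough illustration of that bookkeeping, here is a minimal C++ sketch. The names `StepResult` and `collect_episode_returns` and the `step` callback are all hypothetical: `step` stands in for whatever your own code does in one time step (pick an action epsilon-greedily, apply it, update the Q matrix) and is assumed to return the reward plus a flag that says whether the episode has ended.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Outcome of one environment step (hypothetical interface, not from the question).
struct StepResult {
    double reward;  // reward received at this time step
    bool done;      // true if the episode ended at this step
};

// Runs `num_episodes` episodes and returns the undiscounted total reward of each.
// `step(episode, t)` is a placeholder for the caller's own
// "select action, act, update Q" logic.
std::vector<double> collect_episode_returns(
    std::size_t num_episodes,
    std::size_t max_steps,
    const std::function<StepResult(std::size_t, std::size_t)>& step) {
    assert(max_steps > 0);
    std::vector<double> returns;
    returns.reserve(num_episodes);
    for (std::size_t episode = 0; episode < num_episodes; ++episode) {
        double episode_return = 0.0;  // reset the accumulator for every episode
        for (std::size_t t = 0; t < max_steps; ++t) {
            const StepResult result = step(episode, t);
            episode_return += result.reward;  // accumulate reward at each time step
            if (result.done) break;
        }
        returns.push_back(episode_return);  // one point of the learning curve
    }
    return returns;
}
```

The returned vector has one entry per episode, so entry `i` is the y-value to plot at episode `i`.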

Upvotes: 0

lejlot

Reputation: 66815

Learning curves in RL are typically plots of returns over time, not of Q-values, Q-losses or anything like that. So you should run your environment, compute the total reward (aka the return) of each episode, and plot it against the corresponding point in training.
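To get from a list of returns to something plottable, one option is to dump them to a CSV file and plot that with any external tool. The sketch below assumes the per-episode returns are already stored in a `std::vector<double>`. The function names, the CSV column names and the trailing moving average are illustrative choices, not part of the answer above; the smoothing is optional and is only there because raw returns under an epsilon-greedy policy tend to be noisy.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Trailing moving average: out[i] is the mean of the last `window` returns up to
// and including episode i (fewer than `window` values at the start of the run).
std::vector<double> moving_average(const std::vector<double>& returns,
                                   std::size_t window) {
    assert(window > 0);
    std::vector<double> out;
    out.reserve(returns.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < returns.size(); ++i) {
        sum += returns[i];
        if (i >= window) sum -= returns[i - window];  // drop the value leaving the window
        const std::size_t count = (i + 1 < window) ? i + 1 : window;
        out.push_back(sum / static_cast<double>(count));
    }
    return out;
}

// Writes one "episode,return,smoothed" row per episode.
// Returns false if the file could not be opened or written.
bool write_learning_curve_csv(const std::string& path,
                              const std::vector<double>& returns,
                              std::size_t window) {
    std::ofstream file(path);
    if (!file) return false;
    const std::vector<double> smoothed = moving_average(returns, window);
    file << "episode,return,smoothed\n";
    for (std::size_t i = 0; i < returns.size(); ++i) {
        file << i << ',' << returns[i] << ',' << smoothed[i] << '\n';
    }
    return static_cast<bool>(file);
}
```

Calling `write_learning_curve_csv("curve.csv", returns, 20)` after training (file name and window size chosen arbitrarily here) gives a file where the episode column is the x-axis and either of the other two columns is the y-axis.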

Upvotes: 0
