Reputation: 75
I applied this tutorial https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/23_Time-Series-Prediction.ipynb (on a different dataset). The tutorial did not compute the mean squared error for the individual outputs, so I added the following line in the comparison function:
mean_squared_error(signal_true,signal_pred)
but the loss and MSE computed from the predictions were different from the loss and MSE reported by model.evaluate on the test data. The errors from model.evaluate (loss, MAE, MSE) on the test set:
[0.013499056920409203, 0.07980187237262726, 0.013792216777801514]
The errors for the individual targets (outputs):
Target0 0.167851388666284
Target1 0.6068108648555771
Target2 0.1710370357827747
Target3 2.747463225418181
Target4 1.7965991690103074
Target5 0.9065426398192563
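For reference, the per-target values above come from roughly this inside the comparison function (a simplified sketch; y_true and y_pred stand for the full test targets and predictions, one column per target):
from sklearn.metrics import mean_squared_error

def per_target_mse(y_true, y_pred):
    # y_true, y_pred: arrays of shape (num_steps, num_targets)
    for signal in range(y_true.shape[1]):
        signal_true = y_true[:, signal]
        signal_pred = y_pred[:, signal]
        print("Target%d" % signal, mean_squared_error(signal_true, signal_pred))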
I think it might be a problem in training the model, but I could not find where exactly. I would really appreciate your help.
thanks
Upvotes: 1
Views: 1400
Reputation: 6776
I had the same problem and found a solution. Hopefully this is the same problem you encountered.
It turns out that model.predict doesn't return predictions in the same order that generator.labels does (presumably because the generator shuffles the batches), and that is why the MSE was much larger when I attempted to calculate it manually (using the scikit-learn metric function).
>>> model.evaluate(valid_generator, return_dict=True)['mean_squared_error']
13.17293930053711
>>> mean_squared_error(valid_generator.labels, model.predict(valid_generator)[:,0])
91.1225401637833
My quick and dirty solution:
import numpy as np
from sklearn.metrics import mean_squared_error

valid_generator.reset()  # necessary for starting from the first batch
all_labels = []
all_pred = []
for i in range(len(valid_generator)):  # iterate over each batch exactly once (avoids an infinite loop)
    batch_x, batch_y = next(valid_generator)
    pred_i = model.predict(batch_x)[:, 0]  # predictions for this batch
    all_labels.append(batch_y)             # labels from the same batch, so the order matches
    all_pred.append(pred_i)
    print(np.shape(pred_i), np.shape(batch_y))
cat_labels = np.concatenate(all_labels)
cat_pred = np.concatenate(all_pred)
The result:
>>> mean_squared_error(cat_labels, cat_pred)
13.172956865002352
This can be done much more elegantly, but was enough for me to confirm my hypothesis of the problem and regain some sanity.
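For example, if the validation generator can be (re)created with shuffle=False (assuming it comes from something like ImageDataGenerator.flow_from_directory or flow_from_dataframe), the prediction order already matches generator.labels and the direct comparison from above should agree; a minimal sketch under that assumption:
from sklearn.metrics import mean_squared_error

# Assumes valid_generator was built with shuffle=False, so the batch order is fixed and
# model.predict(valid_generator) returns rows in the same order as valid_generator.labels.
preds = model.predict(valid_generator)[:, 0]
print(mean_squared_error(valid_generator.labels, preds))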
Upvotes: 0
Reputation: 1814
There are a number of reasons why you can see differences between the loss for training and evaluation: for example, layers such as dropout and batch normalization behave differently in the two modes, and the training loss shown per epoch is a running average over batches while evaluation runs a single pass with the final weights.
I'm not sure exactly what problem you're running into; it can be caused by a lot of different things and it's often difficult to debug.
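As a toy illustration of one such cause (a hypothetical model, unrelated to your dataset): a Dropout layer is active in training mode but disabled during evaluation, so the same data gives different losses:
import numpy as np
import tensorflow as tf

# Random toy data, just to show the effect
x = np.random.rand(64, 10).astype("float32")
y = np.random.rand(64, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1),
])
model.compile(loss="mse")

eval_loss = model.evaluate(x, y, verbose=0)                            # dropout disabled
train_mode_mse = np.mean((model(x, training=True).numpy() - y) ** 2)  # dropout active
print(eval_loss, train_mode_mse)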
Upvotes: 1