Reputation: 91
I compared the loss reported by model.evaluate(...) with the same metric computed by hand in NumPy. As you can see below, the two values differ a lot. The kernel had just been restarted, and I cannot find where the problem is.
import numpy as np
import keras
from keras.layers import Dense
from keras.models import Sequential
import keras.backend as K
X = np.random.rand(10000)
Y = X + np.random.rand(10000) / 5
X_train, X_valid = X[:8000], X[8000:]
Y_train, Y_valid = Y[:8000], Y[8000:]
model = Sequential([
    Dense(1, input_shape=(1,), activation='linear'),
])
model.compile('adam', 'mae')
model.fit(X_train, Y_train, epochs=1, batch_size=2000, validation_data=(X_valid, Y_valid))
print(model.evaluate(X_valid, Y_valid))
>>> 0.15643194556236267
preds = model.predict(X_valid)
np.abs(Y_valid - preds).mean()
>>> 0.34461398701699736
Versions: keras = '2.3.1', tensorflow = '2.1.0'.
Upvotes: 1
Views: 114
Reputation: 10474
This is a tricky one, but actually simple to fix:
Your targets Y_valid have shape (2000,), i.e. just an array of 2000 numbers. The network outputs, however, have shape (2000, 1). The expression Y_valid - preds then subtracts a shape (2000, 1) array from a shape (2000,) array. The two shapes do not match, so NumPy broadcasts them. Standard broadcasting rules proceed as follows:
1. Align the shapes on their trailing dimension:
   (      2000,)
   (2000,    1)
2. Prepend an extra dimension of size 1 to the shorter shape:
   (   1, 2000)
   (2000,    1)
3. Stretch every size-1 dimension to match the other array:
   (2000, 2000)
   (2000, 2000)
...and so you are actually subtracting two arrays of shape (2000, 2000) from each other. You are computing the difference between each prediction and all targets instead of just the corresponding one. Since most of those pairs are unrelated, the mean of the absolute differences comes out much larger.
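The broadcasting steps above can be checked with NumPy alone. This is a minimal sketch: the random arrays `y_true` and `y_pred` are stand-ins (assumptions, not the asker's data) with the same shapes as Y_valid and the output of model.predict.

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.random(2000)        # shape (2000,), stands in for Y_valid
y_pred = rng.random((2000, 1))   # shape (2000, 1), stands in for model.predict(X_valid)

diff = y_true - y_pred           # silently broadcasts both operands
print(diff.shape)                # (2000, 2000)

# diff[i, j] is y_true[j] - y_pred[i, 0]: every target against every prediction.
pairwise_mae = np.abs(diff).mean()

# Dropping the trailing axis restores the element-wise comparison,
# which corresponds to the diagonal of the broadcast result.
elementwise_mae = np.abs(y_true - y_pred[:, 0]).mean()
print(pairwise_mae, elementwise_mae)
```

Printing the shape of the difference before taking the mean is usually the quickest way to spot this kind of mismatch.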
tl;dr: model.evaluate is correct. The manual computation is incorrect because of unintended broadcasting. You can fix it by reshaping the predictions to (2000,) (or the targets to (2000, 1)):
preds = model.predict(X_valid)[:, 0]
np.abs(Y_valid - preds).mean()
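If you compute metrics by hand often, one option is to flatten both arrays and check their shapes before subtracting. The helper below is a hypothetical sketch (the name `mae` is made up here and is not part of Keras or NumPy), and it assumes single-output regression where both inputs hold one value per sample.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error that refuses to broadcast mismatched inputs.

    Hypothetical helper: flattens both inputs to 1-D, so (n,) and (n, 1)
    are treated as the same n values.
    """
    y_true = np.asarray(y_true).reshape(-1)
    y_pred = np.asarray(y_pred).reshape(-1)
    if y_true.shape != y_pred.shape:
        raise ValueError(
            f"shape mismatch after flattening: {y_true.shape} vs {y_pred.shape}"
        )
    return np.abs(y_true - y_pred).mean()

# Targets of shape (3,) against predictions of shape (3, 1)
result = mae(np.array([1.0, 2.0, 3.0]), np.array([[1.0], [2.0], [5.0]]))
print(result)
```

With the asker's variables this would be called as mae(Y_valid, model.predict(X_valid)).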
Upvotes: 1
Reputation: 976
It's because the output of model.predict does not have the same shape as Y_valid. If you transpose the predictions, you get almost the same loss.
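The transpose works because a (1, n) array broadcasts against an (n,) array to (1, n), so each target is still paired with its own prediction. A small sketch with made-up numbers (the arrays are illustrative, not the asker's data):

```python
import numpy as np

y_true = np.array([0.1, 0.5, 0.9])          # shape (3,)
y_pred = np.array([[0.2], [0.4], [0.7]])    # shape (3, 1), like model.predict output

via_transpose = np.abs(y_true - np.transpose(y_pred))  # (3,) against (1, 3)
via_flatten = np.abs(y_true - y_pred.ravel())          # (3,) against (3,)

print(via_transpose.shape)  # (1, 3)
print(via_flatten.shape)    # (3,)
print(via_transpose.mean(), via_flatten.mean())
```

Both variants give the same mean; flattening just avoids the leftover leading axis of size 1.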
>>> Y_valid.shape
(2000,)
>>> preds.shape
(2000, 1)
>>> np.abs(Y_valid - np.transpose(preds)).mean()
Upvotes: 2