Different results between training and loading autokeras-model

Question

I trained a regression-model with autokeras resulting in a model with a MAE of 0.2 with that code, where x and y were input and output-dataframes:

X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=1)
search = StructuredDataRegressor(max_trials=1000, loss='mean_squared_error', max_model_size=100000000000, overwrite = True)
search.fit(x=X_train, y=y_train, verbose=2, validation_data=(X_test, y_test))
model = search.export_model()
model.summary()
model.save('model_best')

Refeeding my data to the model delivers a MAE of about 30 with pretty nonsense predictions. My test-output values are in the range of 3 to 10, predicted output-values are in the range of -10 to 5.

model = load_model("model_best2", custom_objects=ak.CUSTOM_OBJECTS)
mae, _ = model.evaluate(x, y, verbose=2)
print('MAE: %.3f' % mae)

Those results are reproducible with any provided model from autokeras. Do you have any clue why training and evaluation results are totally different?

I created a minimal example which is delivering similar bad results so you can try on your own:

from numpy import asarray
from pandas import read_csv
from sklearn.model_selection import train_test_split
from autokeras import StructuredDataRegressor
import matplotlib.pyplot as plt
# load dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/auto-insurance.csv'
dataframe = read_csv(url, header=None)
data = dataframe.values
data = data.astype('float32')
X, y = data[:, :-1], data[:, -1]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)

search = StructuredDataRegressor(max_trials=15, loss='mean_absolute_error')
search.fit(x=X_train, y=y_train, verbose=2)

mae, _ = search.evaluate(X_test, y_test, verbose=2)
print('MAE: %.3f' % mae)

predictions = search.predict(X)

miny = float(y.min())
maxy = float(y.max())
minp = float(min(predictions))
maxp = float(max(predictions))
plt.figure(figsize=(15,15))
plt.scatter(y, predictions, c='crimson',s=5)
p1 = max(maxp, maxy)
p2 = min(minp, miny)
plt.plot([p1, 0], [p1, 0], 'b-')
plt.xlabel('True Values', fontsize=15)
plt.ylabel('Predictions', fontsize=15)
plt.axis('equal')
plt.show()

Different results between training and loading autokeras-model

Answers (0)

Related Questions