Reputation: 3099
I've read that you can't do cross-validation with Keras when you also want to use model callbacks, but then this post showed that it is possible after all. However, I'm having a hard time applying that approach in my own context.
To explore this in more detail, I am following the machinelearningmastery blog and using the iris dataset.
This is a three-class classification problem, and I'm attempting to use a multilayer perceptron (one hidden layer for now, for testing). My goal right now is to work in model callbacks so I can save the weights of the best model. Below, I attempt that in the function network_mlp. To show that the model works without callbacks, I also include network_mlp_no_callbacks.
You should be able to copy/paste this into a Python session and run it, no problem. To replicate the error I'm seeing, uncomment the last line.
Error: RuntimeError: Cannot clone object <keras.wrappers.scikit_learn.KerasClassifier object at 0x7f7e1c9d2290>, as the constructor does not seem to set parameter callbacks
Code: the first section reads in the data; the second is the model with callbacks, which is not working; the third is the model without callbacks, which works (included to provide context).
#!/usr/bin/env python
import numpy as np
import pandas, math, sys, keras
from keras.models import Sequential
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from keras.utils import np_utils
from keras.utils.np_utils import to_categorical
from sklearn.preprocessing import LabelEncoder
def read_data_mlp(train_file):
    train_data = pandas.read_csv(train_file, header=None)
    train_data = train_data.values
    X = train_data[:, 0:4].astype('float32')
    Y = train_data[:, 4]
    scaler = MinMaxScaler(feature_range=(0, 1))
    # encode class values as integers
    encoder = LabelEncoder()
    encoder.fit(Y)
    encoded_Y = encoder.transform(Y)
    # convert integers to dummy variables (i.e. one-hot encoded)
    dummy_y = np_utils.to_categorical(encoded_Y)
    X_train_s = scaler.fit_transform(X)
    return (X_train_s, dummy_y)
def network_mlp(X, Y, out_dim=10, b_size=30, num_classes=3, epochs=10):
    # out_dim is the dimensionality of the hidden layer;
    # b_size is the batch size. There are 150 examples total.
    filepath = "weights_mlp.hdf5"
    def mlp_model():
        model = Sequential()
        model.add(Dense(out_dim, input_dim=4, activation='relu', kernel_initializer='he_uniform'))
        model.add(Dense(num_classes, activation='softmax'))
        model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
        return model
    checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
    callbacks_list = [checkpoint]
    estimator = KerasClassifier(build_fn=mlp_model, epochs=epochs, batch_size=b_size, verbose=0, callbacks=callbacks_list)
    kfold = KFold(n_splits=10, shuffle=True, random_state=7)
    results = cross_val_score(estimator, X, Y, cv=kfold)
    print("MLP: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
    return 0
def network_mlp_no_callbacks(X, Y, out_dim=10, b_size=30, num_classes=3, epochs=10):
    def mlp_model():
        model = Sequential()
        model.add(Dense(out_dim, input_dim=4, activation='relu', kernel_initializer='he_uniform'))
        model.add(Dense(num_classes, activation='softmax'))
        model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
        return model
    estimator = KerasClassifier(build_fn=mlp_model, epochs=epochs, batch_size=b_size, verbose=0)
    kfold = KFold(n_splits=10, shuffle=True, random_state=7)
    results = cross_val_score(estimator, X, Y, cv=kfold)
    print("MLP: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
    return 0
if __name__ == '__main__':
    X, Y = read_data_mlp('iris.csv')
    network_mlp_no_callbacks(X, Y, out_dim=10, b_size=30, num_classes=3, epochs=10)
    #network_mlp(X, Y, out_dim=10, b_size=30, num_classes=3, epochs=10)
QUESTION: How can I incorporate model callbacks into KerasClassifier?
Upvotes: 3
Views: 2947
Reputation: 741
The solution is fairly close to the other answer you referenced, but slightly different because they are using several estimators and you have only one. I was able to get checkpointing working by adding fit_params={'callbacks': callbacks_list} to the cross_val_score call, removing the callbacks list from the estimator initialization, and changing save_best_only to False.
So now the relevant subsection of code in network_mlp looks like this:
checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=False, mode='max')
callbacks_list = [checkpoint]
estimator = KerasClassifier(build_fn=mlp_model, epochs=epochs, batch_size=b_size, verbose=0)
kfold = KFold(n_splits=10, shuffle=True, random_state=7)
results = cross_val_score(estimator, X, Y, cv=kfold, fit_params={'callbacks': callbacks_list})
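For context on why this works: cross_val_score clones the estimator for each fold via sklearn.base.clone, which rebuilds it from its constructor parameters, and the callback objects don't survive that round trip cleanly (hence the RuntimeError), whereas anything in fit_params is passed straight through to each clone's fit() call instead. A minimal sketch of the failure, reusing mlp_model and callbacks_list from your code:

from sklearn.base import clone

# clone() rebuilds the estimator from get_params() and then checks that
# every constructor parameter was set on the copy; the callback objects
# fail that check, which is exactly the RuntimeError you saw.
broken = KerasClassifier(build_fn=mlp_model, epochs=10, batch_size=30,
                         verbose=0, callbacks=callbacks_list)
clone(broken)  # RuntimeError: Cannot clone object ...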
save_best_only=False is necessary because you don't have a validation split set up for the neural network, and thus val_acc is unavailable. If you want to use a validation sub-split, you can, for example, change the estimator initialization to:
estimator = KerasClassifier(build_fn=mlp_model, epochs=epochs, batch_size=b_size, verbose=0, validation_split=.25)
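Once a validation split exists, val_acc is computed at the end of every epoch, so you could also switch the checkpoint back to save_best_only=True. A minimal sketch, reusing mlp_model, epochs, b_size, X, Y, and kfold from above (note that all ten folds write to the same file, so it ends up holding the best weights from the last fold):

# With validation_split set, 'val_acc' is available again, so
# save_best_only=True is valid; each fold overwrites weights_mlp.hdf5.
checkpoint = ModelCheckpoint("weights_mlp.hdf5", monitor='val_acc',
                             verbose=1, save_best_only=True, mode='max')
estimator = KerasClassifier(build_fn=mlp_model, epochs=epochs,
                            batch_size=b_size, verbose=0,
                            validation_split=0.25)
results = cross_val_score(estimator, X, Y, cv=kfold,
                          fit_params={'callbacks': [checkpoint]})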
Good luck!
Upvotes: 2