Reputation: 619
I'm trying to tune the hyperparameters of a neural network built with Keras. This is my code, with a comment on the line that causes the error:
from sklearn.cross_validation import StratifiedKFold, cross_val_score
from sklearn import grid_search
from sklearn.metrics import classification_report
import multiprocessing
from keras.models import Sequential
from keras.layers import Dense
from sklearn.preprocessing import LabelEncoder
from keras.utils import np_utils
from keras.wrappers.scikit_learn import KerasClassifier
import numpy as np
def tuning(X_train,Y_train,X_test,Y_test):
    in_size=X_train.shape[1]
    num_cores=multiprocessing.cpu_count()
    model = Sequential()
    model.add(Dense(in_size, input_dim=in_size, init='uniform', activation='relu'))
    model.add(Dense(8, init='uniform', activation='relu'))
    model.add(Dense(1, init='uniform', activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    batch_size = [10, 20, 40, 60, 80, 100]
    epochs = [10,20]
    param_grid = dict(batch_size=batch_size, nb_epoch=epochs)
    k_model = KerasClassifier(build_fn=model, verbose=0)
    clf = grid_search.GridSearchCV(estimator=k_model, param_grid=param_grid, cv=StratifiedKFold(Y_train, n_folds=10, shuffle=True, random_state=1234),
                                   scoring="accuracy", verbose=100, n_jobs=num_cores)
    clf.fit(X_train, Y_train) #ERROR HERE
    print("Best parameters set found on development set:")
    print()
    print(clf.best_params_)
    print()
    print("Grid scores on development set:")
    print()
    for params, mean_score, scores in clf.grid_scores_:
        print("%0.3f (+/-%0.03f) for %r"
              % (mean_score, scores.std() * 2, params))
    print()
    print("Detailed classification report:")
    print()
    print("The model is trained on the full development set.")
    print("The scores are computed on the full evaluation set.")
    print()
    y_true, y_pred = Y_test, clf.predict(X_test)
    print(classification_report(y_true, y_pred))
    print()
And this is the error report:
clf.fit(X_train, Y_train)
File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 804, in fit
return self._fit(X, y, ParameterGrid(self.param_grid))
File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 553, in _fit
for parameters in parameter_iterable
File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 800, in __call__
while self.dispatch_one_batch(iterator):
File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 658, in dispatch_one_batch
self._dispatch(tasks)
File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 566, in _dispatch
job = ImmediateComputeBatch(batch)
File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 180, in __init__
self.results = batch()
File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 72, in __call__
return [func(*args, **kwargs) for func, args, kwargs in self.items]
File "/usr/local/lib/python2.7/dist-packages/sklearn/cross_validation.py", line 1531, in _fit_and_score
estimator.fit(X_train, y_train, **fit_params)
File "/usr/local/lib/python2.7/dist-packages/keras/wrappers/scikit_learn.py", line 135, in fit
**self.filter_sk_params(self.build_fn.__call__))
TypeError: __call__() takes at least 2 arguments (1 given)
Am I missing something? The grid search goes well with random forests, svm and logistic regression. I only have problems with Neural Networks.
Upvotes: 6
Views: 11148
Reputation: 804
I hope that you've solved the problem by now.
a) I guess the problem is that you're not returning the model at the end of the wrapper function tuning(). Use return model.
b) In k_model = KerasClassifier(build_fn=model, verbose=0), I think it should be build_fn=tuning, according to how you named your function.
c) The method's signature def tuning(X_train,Y_train,X_test,Y_test) isn't correct. The parameters passed to the function should instead be the ones that change on every iteration (i.e. the hyperparameters you specified in param_grid). Use def tuning(batch_size, nb_epoch) instead.
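To illustrate point (c): the traceback shows the wrapper calling filter_sk_params on build_fn, which suggests it passes the build function only the parameters that function declares. Below is a minimal pure-Python sketch of that idea. The names filter_params, build and hidden_units are hypothetical, and the dict stands in for a real model; this is not the Keras source.

```python
import inspect


def filter_params(fn, params):
    """Keep only the entries of params that fn accepts as named arguments."""
    accepted = inspect.signature(fn).parameters
    return {name: value for name, value in params.items() if name in accepted}


def build(hidden_units=8):
    # Stand-in for a function that would build and return a compiled model.
    return {"hidden_units": hidden_units}


# One hypothetical grid point: only hidden_units matches build()'s signature.
grid_point = {"hidden_units": 16, "batch_size": 20, "nb_epoch": 10}
build_kwargs = filter_params(build, grid_point)
model = build(**build_kwargs)
print(build_kwargs)
```

The takeaway is that the build function's arguments should be hyperparameters (ideally with defaults), not the training data.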
I hope that was helpful!
Upvotes: 0
Reputation: 406
I think you may be using scikit-learn 0.16 or an earlier version.
I ran into the same issue yesterday, and after trying some workarounds I found that upgrading scikit-learn from 0.16 to 0.18 solves it.
clf.fit(X_train, Y_train) #SHOULD WORK with scikit-learn 0.18
One more difference between 0.16 and 0.18 is that GridSearchCV no longer comes from sklearn.grid_search but from sklearn.model_selection.
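If the same script has to run against both old and new scikit-learn versions, one option is to look up GridSearchCV in whichever module is available. This is only a sketch: find_grid_search_module is a hypothetical helper name, and it returns None when scikit-learn is not installed at all.

```python
import importlib


def find_grid_search_module():
    """Return the first importable module that provides GridSearchCV, or None."""
    # sklearn.model_selection is the 0.18+ location; sklearn.grid_search is the older one.
    for name in ("sklearn.model_selection", "sklearn.grid_search"):
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue
        if hasattr(module, "GridSearchCV"):
            return module
    return None


gs_module = find_grid_search_module()
if gs_module is None:
    print("scikit-learn is not installed")
else:
    print("GridSearchCV found in %s" % gs_module.__name__)
```

Keep in mind that other names moved too: in sklearn.model_selection, StratifiedKFold takes n_splits and no longer receives the labels in its constructor, so the cv argument in the question would need adjusting as well.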
Upvotes: 0
Reputation: 9099
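The error comes from build_fn. The last frame of the traceback calls self.build_fn.__call__, and because build_fn was given an already-built model rather than a function, the wrapper most likely ends up calling the model itself without the input argument it expects.
So you need to explicitly define a new function that builds and returns the model, and pass it as build_fn=make_model: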
def make_model(batch_size, nb_epoch):
    # in_size is the number of input features (X_train.shape[1]);
    # it must be defined in an enclosing scope.
    model = Sequential()
    model.add(Dense(in_size, input_dim=in_size, init='uniform', activation='relu'))
    model.add(Dense(8, init='uniform', activation='relu'))
    model.add(Dense(1, init='uniform', activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
Also check keras/examples/mnist_sklearn_wrapper.py, where GridSearchCV is used for hyper-parameter search.
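To see why passing a model instance instead of a function produces this kind of TypeError, here is a small reproduction with stand-in classes. FakeModel and wrapper_fit are hypothetical and only mimic the call pattern visible in the traceback; they are not Keras code.

```python
class FakeModel(object):
    """Stand-in for a model instance; calling it requires an input x."""

    def __call__(self, x, mask=None):
        return x


def make_model():
    # Stand-in for a build function: no required arguments, returns a model.
    return FakeModel()


def wrapper_fit(build_fn):
    """Mimic the wrapper: call build_fn with no positional arguments."""
    return build_fn()


# Passing an instance: the wrapper ends up calling FakeModel.__call__ without x.
try:
    wrapper_fit(FakeModel())
    instance_error = None
except TypeError as exc:
    instance_error = exc

# Passing the function: the wrapper gets a fresh model back.
built = wrapper_fit(make_model)

print("instance as build_fn -> %r" % (instance_error,))
print("function as build_fn -> %s" % type(built).__name__)
```

The first call fails because the instance's __call__ wants an input, while the second succeeds because the function can be called with no arguments and returns a new model each time.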
Upvotes: 5