B_Miner

Reputation: 1820

Save sklearn cross validation object

Following the sklearn tutorial, I tried to save an object created with sklearn, but the save failed. The problem appears to be with the cross-validation object, since I can save the final model itself.

Given:

rf_model = RandomForestRegressor(n_estimators=1000, n_jobs=4, compute_importances=False)
cvgridsrch = GridSearchCV(estimator=rf_model, param_grid=parameters, n_jobs=4)
cvgridsrch.fit(X,y)

This will succeed:

joblib.dump(cvgridsrch.best_estimator_, 'C:\\Users\\Desktop\\DMA\\cvgridsrch.pkl', compress=9)

and this will fail:

joblib.dump(cvgridsrch, 'C:\\Users\\Desktop\\DMA\\cvgridsrch.pkl', compress=9)

with error:

PicklingError: Can't pickle <type 'instancemethod'>: it's not found as __builtin__.instancemethod

How can I save the full GridSearchCV object?

Upvotes: 1

Views: 1362

Answers (3)

ntg

Reputation: 14075

If you are using Python 2, try:

import dill

so that lambda functions can be pickled.

Upvotes: 1

Josefine

Reputation: 181

I know this is an old question, but it might be useful for people coming here having the same, or similar, problem.

I'm not sure about the specific error message, but I managed to successfully save the entire GridSearchCV object in my own project by using pickle:

import pickle
gs = GridSearchCV(estimator, param_grid)  # create the grid search object (estimator and param_grid defined elsewhere)
gs.fit(X, y)  # fit the model
with open('file_name', 'wb') as f:
    pickle.dump(gs, f) # save the object to a file

Then you can use

with open('file_name', 'rb') as f:
    gs = pickle.load(f)

to read the file back and use the object again.
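If it helps, here is a minimal sketch of the same save/load round trip wrapped in two helpers. The names `save_object` and `load_object` are my own, not part of sklearn or pickle, and the dictionary is only a stand-in for a fitted search object so the snippet runs without sklearn installed.

```python
import os
import pickle
import tempfile


def save_object(obj, path):
    """Write obj to path with pickle (the file must be opened in binary mode)."""
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_object(path):
    """Read back an object previously written by save_object."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Placeholder for a fitted search object; any picklable object behaves the same.
fitted = {"best_params_": {"max_depth": 3}}

# Write to a temporary directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "gridsearch.pkl")
save_object(fitted, path)
restored = load_object(path)
```

Keep in mind that unpickling can execute arbitrary code, so only load pickle files that come from a source you trust.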

Upvotes: 0

log0

Reputation: 551

One possible cause is a multithreading issue; see this Stack Overflow answer.

Also, could you try dumping the object with plain pickle instead of joblib (and not cPickle, which is more restrictive)?
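To narrow down which part of the object pickle is rejecting, something like the following sketch may help. It tries to pickle each attribute separately and collects the failures. `unpicklable_attributes` and `FakeSearch` are hypothetical names used for illustration; `FakeSearch` only imitates a search object that happens to hold a lambda, and I am not claiming that this is what GridSearchCV stores internally.

```python
import pickle


def unpicklable_attributes(obj):
    """Return {attribute name: error text} for attributes that pickle rejects."""
    failures = {}
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as exc:  # pickle raises several different error types
            failures[name] = "%s: %s" % (type(exc).__name__, exc)
    return failures


class FakeSearch(object):
    """Hypothetical stand-in for a fitted search object."""

    def __init__(self):
        self.best_params_ = {"max_depth": 3}  # plain data, pickles fine
        self.scorer = lambda y_true, y_pred: 0.0  # a lambda, which pickle rejects


report = unpicklable_attributes(FakeSearch())
for name, error in report.items():
    print(name, "->", error)
```

Running the same helper on the real `cvgridsrch` object should point at the attribute that triggers the PicklingError, assuming the object keeps its state in `__dict__`.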

Upvotes: 0
