Reputation: 2439
When storing a classifier trained with sklearn, I have a choice between pickle (or cPickle) and joblib.dump().
Are there any benefits, apart from performance, to using joblib.dump()? Can a classifier saved with pickle produce worse results than one saved with joblib?
Upvotes: 4
Views: 6354
Reputation: 2128
joblib works especially well with the NumPy arrays that sklearn uses, so depending on the type of classifier, you might get performance and file-size benefits from joblib.
Otherwise, pickle works correctly, so saving a trained classifier and loading it again will produce the same results no matter which of the two serialization libraries you use. See also the sklearn docs on this topic.
Please note that joblib is installed together with sklearn as a dependency (older sklearn versions also bundled it as sklearn.externals.joblib).
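To check the "same results" claim yourself, here is a minimal sketch. The dataset (iris) and the model (LogisticRegression) are just my choices for illustration and are not from the question; any fitted estimator should behave the same way. It round-trips one fitted classifier through pickle and through joblib, then compares the predictions.

```python
import io
import pickle

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Illustrative data and model (assumptions, not part of the question)
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Round trip through pickle, entirely in memory
clf_pickle = pickle.loads(pickle.dumps(clf))

# Round trip through joblib, using an in-memory buffer instead of a file
buf = io.BytesIO()
joblib.dump(clf, buf)
buf.seek(0)
clf_joblib = joblib.load(buf)

# Both restored classifiers should predict exactly like the original
same_as_pickle = np.array_equal(clf.predict(X), clf_pickle.predict(X))
same_as_joblib = np.array_equal(clf.predict(X), clf_joblib.predict(X))
print(same_as_pickle, same_as_joblib)
```

In practice you would usually pass a file path to joblib.dump and joblib.load; the buffer is only there to keep the sketch from touching the disk.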
Upvotes: 2
Reputation: 2481
They actually use the same protocol (i.e. joblib uses pickle under the hood). Check out the documentation for joblib.dump: you can specify the compression level through its compress argument.
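A rough sketch of the compress argument in use. The params dict is a hypothetical stand-in for a fitted model's arrays (all zeros, so it compresses unusually well), and dumped_size is a helper I made up for the comparison; real models will see smaller savings.

```python
import io

import joblib
import numpy as np

# Hypothetical stand-in for a fitted model's parameters (not a real model)
params = {"coef": np.zeros((1000, 100)), "intercept": np.zeros(100)}


def dumped_size(obj, compress):
    """Return the number of bytes joblib.dump writes for obj."""
    buf = io.BytesIO()
    joblib.dump(obj, buf, compress=compress)
    return buf.getbuffer().nbytes


size_raw = dumped_size(params, compress=0)   # no compression (the default)
size_zlib = dumped_size(params, compress=3)  # zlib, compression level 3

# A compressed dump loads back with the ordinary joblib.load call
buf = io.BytesIO()
joblib.dump(params, buf, compress=3)
buf.seek(0)
restored = joblib.load(buf)

print(size_raw, size_zlib)
```

Higher levels trade slower dumps and loads for smaller files, so it is worth measuring on your own model before picking one.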
Upvotes: 3