Ivan Bilan

Reputation: 2439

Difference between saving a classifier with pickle and joblib.dump?

When storing a classifier trained with sklearn, I have a choice between pickle (or cPickle) and joblib.dump().

Are there any benefits to using joblib.dump() apart from performance? Can a classifier saved with pickle produce worse results than one saved with joblib?

Upvotes: 4

Views: 6354

Answers (2)

Maccesch

Reputation: 2128

joblib works especially well with NumPy arrays, which sklearn uses heavily. Depending on the type of classifier, you may therefore get speed and file-size benefits from joblib, particularly when the fitted model holds large arrays.

Otherwise, pickle works correctly: saving a trained classifier and loading it again produces the same results whichever of the two serialization libraries you use. See also the sklearn docs on this topic.
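To check this yourself, here is a minimal sketch that saves one classifier both ways and compares the predictions after reloading. The synthetic dataset, the choice of LogisticRegression and the file names are my assumptions for illustration, not something from the question.

```python
import os
import pickle
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Train a small classifier on synthetic data (illustrative only).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    pickle_path = os.path.join(tmp, "clf.pkl")        # hypothetical file name
    joblib_path = os.path.join(tmp, "clf.joblib")     # hypothetical file name

    # Save the same fitted model with both libraries.
    with open(pickle_path, "wb") as f:
        pickle.dump(clf, f)
    joblib.dump(clf, joblib_path)

    # Load both copies back.
    with open(pickle_path, "rb") as f:
        clf_pickle = pickle.load(f)
    clf_joblib = joblib.load(joblib_path)

# Both reloaded models should predict exactly what the original predicts.
same_predictions = (
    np.array_equal(clf.predict(X), clf_pickle.predict(X))
    and np.array_equal(clf.predict(X), clf_joblib.predict(X))
)
print(same_predictions)
```

Either file should be loaded with the same library that wrote it, and ideally with the same sklearn version that was used for training.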

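Please note that joblib is installed together with sklearn as one of its dependencies. Older releases also bundled a copy as sklearn.externals.joblib, which has since been removed, so use a plain "import joblib" in current versions.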

Upvotes: 2

mprat

Reputation: 2481

They actually use the same protocol (i.e. joblib uses pickle under the hood). Check out the documentation for joblib.dump: you can set the compression level through its compress argument.
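As a rough sketch of the compress argument, the snippet below writes the same array with and without compression and compares the file sizes. The all-zeros array and the file names are assumptions chosen for the demo. Real model data will compress far less than this.

```python
import os
import tempfile

import joblib
import numpy as np

# A highly compressible array, used only to make the size difference obvious.
data = np.zeros((1000, 1000))

with tempfile.TemporaryDirectory() as tmp:
    raw_path = os.path.join(tmp, "raw.joblib")        # hypothetical file name
    packed_path = os.path.join(tmp, "packed.joblib")  # hypothetical file name

    joblib.dump(data, raw_path)                 # default: no compression
    joblib.dump(data, packed_path, compress=3)  # integer level from 0 to 9

    raw_size = os.path.getsize(raw_path)
    packed_size = os.path.getsize(packed_path)

    # joblib.load detects the compression on its own.
    restored = joblib.load(packed_path)

print(raw_size, packed_size)
```

Higher levels give smaller files at the cost of slower dumps and loads, so it is a trade-off rather than a free win.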

Upvotes: 3
