Difference between saving a classifier with pickle and joblib.dump?

Question

When storing a classifier trained with sklearn, I can choose between pickle (or cPickle) and joblib.dump().

Are there any benefits to using joblib.dump() apart from performance? Can a classifier saved with pickle produce worse results than one saved with joblib?

Maccesch · Accepted Answer

joblib works especially well with NumPy arrays, which sklearn uses heavily, so depending on the type of classifier you may see speed and file-size benefits from joblib.
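To see how the two compare on array-heavy objects, here is a minimal sketch that writes the same object with both libraries and prints the file sizes. The `state` dict is a hypothetical stand-in for a fitted estimator's internals, and `compress=3` is just an illustrative level; actual sizes and timings depend on your data and library versions.

```python
import os
import pickle
import tempfile

import joblib
import numpy as np

# Hypothetical stand-in for a fitted estimator: a dict holding a large array.
state = {"weights": np.zeros((1000, 1000))}

with tempfile.TemporaryDirectory() as tmp:
    pickle_path = os.path.join(tmp, "state.pkl")
    joblib_path = os.path.join(tmp, "state.joblib")

    # Plain pickle: no compression.
    with open(pickle_path, "wb") as f:
        pickle.dump(state, f)

    # joblib with optional compression (level chosen for illustration).
    joblib.dump(state, joblib_path, compress=3)

    pickle_size = os.path.getsize(pickle_path)
    joblib_size = os.path.getsize(joblib_path)

    # Load back before the temporary directory is removed.
    restored = joblib.load(joblib_path)

print("pickle bytes:", pickle_size)
print("joblib bytes:", joblib_size)
```

Without `compress`, the size difference is usually small; the benefit then tends to show up more in speed and memory use on large arrays.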

Otherwise, pickle works correctly: saving a trained classifier and loading it again produces the same results whichever serialization library you use. See also the sklearn docs on this topic.
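You can check the "same results" claim yourself with a quick round trip. This is a sketch only: the iris dataset and `LogisticRegression` are arbitrary choices for illustration, and everything is kept in memory to avoid touching disk.

```python
import io
import pickle

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train a small classifier (dataset and model chosen only for illustration).
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Round trip through pickle, entirely in memory.
clf_pickle = pickle.loads(pickle.dumps(clf))

# Round trip through joblib, using an in-memory file object.
buf = io.BytesIO()
joblib.dump(clf, buf)
buf.seek(0)
clf_joblib = joblib.load(buf)

# Both restored models should predict exactly what the original predicts.
expected = clf.predict(X)
same = (
    np.array_equal(expected, clf_pickle.predict(X))
    and np.array_equal(expected, clf_joblib.predict(X))
)
print("identical predictions:", same)
```

Either way, load the model with the same scikit-learn version that saved it; that matters far more than which of the two libraries wrote the file.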

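Please note that joblib is installed as a dependency of sklearn. Older sklearn releases bundled it as sklearn.externals.joblib, but that module has since been removed, so in current versions use import joblib directly.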
