Save a scikit-learn pipeline to a file

Question

How can I save a trained scikit-learn pipeline to a local file? The official documentation covers model persistence here: https://scikit-learn.org/stable/modules/model_persistence.html

But when trying to save a pipeline, I get an error. Example:

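import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

# Two steps: TF-IDF features from whitespace-split tokens, then a random forest.
estimators = [
    ('tfidf', TfidfVectorizer(tokenizer=lambda string: string.split(),
                              min_df=20,
                              max_df=0.75,
                              ngram_range=(1, 1))),
    ('clf', RandomForestClassifier(n_estimators=100,
                                   n_jobs=-1,
                                   class_weight='balanced'))
]

# x_train and y_train are the training texts and labels (not shown here).
p = Pipeline(estimators)
p.fit(x_train, y_train)

# Saving the fitted pipeline is the call that raises the error.
model = 'model.joblib'
joblib.dump(p, model)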

However, I get the following error message:

PicklingError: Can't pickle <function <lambda> at 0x7f4c9f1e50d0>: it's not found as __main__.<lambda>

How can I solve this problem?
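
A likely cause is the lambda passed as the tokenizer: pickle stores functions by reference to an importable name, and a lambda has none. The sketch below is not taken from the scikit-learn documentation. It uses only the standard pickle module to show the difference between a lambda and a module-level function, and the name split_on_whitespace is hypothetical.

```python
import pickle


def split_on_whitespace(text):
    """Hypothetical named replacement for `lambda string: string.split()`."""
    return text.split()


# A lambda has no importable name, so pickle cannot store a reference to it.
try:
    pickle.dumps(lambda string: string.split())
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

# A module-level function is pickled by reference (module name + function name),
# so it survives a dump/load round trip.
restored = pickle.loads(pickle.dumps(split_on_whitespace))
tokens = restored("save a fitted pipeline")
```

Applied to the pipeline in the question, this would mean passing tokenizer=split_on_whitespace to TfidfVectorizer before calling joblib.dump. Because only a reference is stored, the function would presumably also need to be importable under the same module path wherever the file is loaded.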
