I am training a random forest model to predict title clusters. When I run it in a notebook, the predicted clusters are correct. But after I load the saved random forest model in a Flask app, the prediction is the same for every input. Could you suggest what might be wrong? Thanks.
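import pickle
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer

# In the notebook: fit the vectorizer, train the model, save the model
feature_dim = 2 ** 10
vectorizer = TfidfVectorizer(max_features=feature_dim)
text = df['text'].values
X = vectorizer.fit_transform(text)  # fit the vocabulary once and vectorize
rf_model = RandomForestClassifier(n_estimators=100)
# X1_train, y1_train are a training split of X and the cluster labels (split not shown)
rf_model.fit(X1_train, y1_train)
# Only the model is saved here; the fitted vectorizer is not
with open('rf_model.sav', 'wb') as f:
    pickle.dump(rf_model, f)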
# In the Flask app: load the saved model and predict
with open('rf_model.sav', 'rb') as f:
    rf_model = pickle.load(f)
titles = [
    "title_1",
    "title_2",
]
X_ti = vectorizer.transform(titles)
y_rf = rf_model.predict(X_ti)
print(y_rf)
Results look like: [8 8 8 8 8 8 8]
Is this caused by not dumping the fitted TF-IDF vectorizer along with the model?
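If the vectorizer is the cause, one way to check would be to save the fitted vectorizer and the forest as a single object, so the serving side cannot end up with a different vocabulary. Below is a minimal sketch of that idea, assuming scikit-learn's `make_pipeline`. The toy texts, labels, and the file name `rf_pipeline.sav` are hypothetical stand-ins, not the real data from this question.

```python
import os
import pickle
import tempfile

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for df['text'] and the cluster labels.
texts = [
    "cheap flights to paris",
    "paris hotel deals",
    "python pickle error",
    "flask app python deploy",
]
labels = [0, 0, 1, 1]

# The vectorizer and the classifier are fitted together,
# so both use the same vocabulary and feature order.
pipeline = make_pipeline(
    TfidfVectorizer(max_features=2 ** 10),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
pipeline.fit(texts, labels)

# Training side: dump the whole pipeline, not just the forest.
model_path = os.path.join(tempfile.mkdtemp(), "rf_pipeline.sav")
with open(model_path, "wb") as f:
    pickle.dump(pipeline, f)

# Serving side (for example inside a Flask view): load and pass raw strings.
with open(model_path, "rb") as f:
    loaded = pickle.load(f)

titles = ["paris flights", "python flask"]
predictions = loaded.predict(titles)
print(predictions)
```

With this layout the Flask code never calls `fit` or `fit_transform` itself; it only calls `predict` on raw strings. If the predictions still collapse to one class after this change, the vectorizer is probably not the culprit.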