Reputation: 11
I'm using logistic regression with sklearn to predict categories from text descriptions. Here's the code at the moment:
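from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# df is a pandas DataFrame with 'description' and 'category' columns
X_Train, X_Test, y_train, y_test = train_test_split(df['description'], df['category'])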
count_vect = CountVectorizer()
X_Train_counts = count_vect.fit_transform(X_Train)
tfidf_transformer = TfidfTransformer()
X_Train_tfidf = tfidf_transformer.fit_transform(X_Train_counts)
# Fit the logistic regression model
clf = LogisticRegression(random_state=0, class_weight='balanced', solver='lbfgs', max_iter=1000)
clf.fit(X_Train_tfidf, y_train)
To check my predictions manually, I also do this:
# Make predictions
predictions = clf.predict(tfidf_transformer.transform(count_vect.transform(X_Test)))
print(X_Test.iloc[7])
print(predictions[7])
My question is: how can I predict a category for a custom description that comes from outside the test data (e.g. one I type in manually)?
If that's possible, is there also a way to get the top n predicted categories for that custom text?
Upvotes: 1
Views: 528
Reputation: 3851
You should be able to do something like this:
your_description = "some text"
vectorized = tfidf_transformer.transform(count_vect.transform([your_description]))
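# vectorized already has shape (1, n_features), so pass it to predict as-is;
# reshaping it to (-1, 1) would no longer match the features the model was fit on
predictions = clf.predict(vectorized)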
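For the top n categories, one option is to ask the classifier for class probabilities with predict_proba and sort them. Below is a minimal sketch, not a definitive implementation: the toy descriptions and categories are made up so the snippet runs on its own, and top_n_categories is a hypothetical helper name. In your case you would reuse your already fitted count_vect, tfidf_transformer and clf instead of refitting.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import LogisticRegression

# Toy training data, made up for illustration only
descriptions = [
    "fresh apples and bananas", "ripe oranges and grapes",
    "laptop with fast processor", "smartphone with large screen",
    "cotton shirt and jeans", "wool sweater and scarf",
]
categories = ["food", "food", "electronics", "electronics", "clothing", "clothing"]

# Same steps as in the question: counts -> tf-idf -> logistic regression
count_vect = CountVectorizer()
tfidf_transformer = TfidfTransformer()
X_train_tfidf = tfidf_transformer.fit_transform(count_vect.fit_transform(descriptions))

clf = LogisticRegression(random_state=0, class_weight='balanced', solver='lbfgs', max_iter=1000)
clf.fit(X_train_tfidf, categories)


def top_n_categories(text, n=3):
    """Hypothetical helper: return the n most probable (category, probability) pairs."""
    # Wrap the single string in a list so the vectorizer sees one document
    vectorized = tfidf_transformer.transform(count_vect.transform([text]))
    # predict_proba returns one row per document; columns follow clf.classes_
    probabilities = clf.predict_proba(vectorized)[0]
    # Indices of the n largest probabilities, highest first
    top_indices = np.argsort(probabilities)[::-1][:n]
    return [(clf.classes_[i], float(probabilities[i])) for i in top_indices]


top_two = top_n_categories("a new laptop with a bright screen", n=2)
print(top_two)
```

Words that were not seen during training are simply ignored by count_vect.transform, so a description made up entirely of unknown words will give probabilities that carry little information.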
Upvotes: 1