asmgx

Reputation: 8014

How can I see TF-IDF values from tfidf_vectorizer?

I am using Python.

I have this code that analyzes text documents:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

tfidf_vectorizer = TfidfVectorizer(max_df=0.8, max_features=10000)

# split dataset into training and validation set
xtrain, xval, ytrain, yval = train_test_split(movies_new['clean_plot'], y, test_size=0.2, random_state=9)

# create TF-IDF features
xtrain_tfidf = tfidf_vectorizer.fit_transform(xtrain)
xval_tfidf = tfidf_vectorizer.transform(xval)

I know that TF-IDF assigns a value to each word.

Is there a way to see the values inside xtrain_tfidf?

Upvotes: 0

Views: 521

Answers (1)

nag

Reputation: 779

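Here is an example that converts the TF-IDF matrix to a pandas DataFrame, with one row per document and one column per term: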

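from sklearn.feature_extraction.text import TfidfVectorizer
import pandas as pd

# Placeholder corpus for illustration; substitute your own texts (e.g. xtrain)
documents = ["the cat sat on the mat", "the dog chased the cat"]

vect = TfidfVectorizer()
tfidf_matrix = vect.fit_transform(documents)
# get_feature_names() was removed in scikit-learn 1.2; get_feature_names_out() exists since 1.0
df = pd.DataFrame(tfidf_matrix.toarray(), columns=vect.get_feature_names_out())
print(df)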

Upvotes: 1
