Yunfeng Xi

Reputation: 1

TfidfVectorizer does not use the whole set of words in all documents?

I am trying to build a TF-IDF model with TfidfVectorizer. The feature name list (that is, the number of columns in the sparse matrix) is shorter than the set of distinct words in my documents, even though I set min_df to 1. What is happening?

Upvotes: 0

Views: 651

Answers (1)

jtitusj

Reputation: 3086

Did you check the stop_words and max_features parameters? If you pass a value for either of them, some words will be excluded from the vocabulary.
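A quick way to see this is to fit the vectorizer on a tiny corpus and compare vocabularies. The sketch below is only an illustration: the two documents and the `vocab` helper are made up, and it assumes the "word set" in the question was built by splitting on whitespace. Besides stop_words and max_features, it also shows two defaults that can shrink the vocabulary even when neither is set: lowercase=True merges "Cat" and "cat", and the default token_pattern only keeps tokens of two or more characters, so "a" and "I" disappear.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus, not from the question.
docs = [
    "I saw a cat and a dog",
    "The Cat sat on the mat",
]

# Assumed "word set": split on whitespace, case preserved.
naive_vocab = {word for doc in docs for word in doc.split()}


def vocab(**kwargs):
    """Fit a TfidfVectorizer with min_df=1 and return its vocabulary as a set."""
    vec = TfidfVectorizer(min_df=1, **kwargs)
    vec.fit(docs)
    return set(vec.vocabulary_)


# Defaults: lowercasing plus a token pattern that drops 1-character tokens.
default_vocab = vocab()

# stop_words removes common English words such as "the" and "and".
no_stop_vocab = vocab(stop_words="english")

# max_features keeps only the most frequent terms across the corpus.
capped_vocab = vocab(max_features=3)

# Turning off lowercasing and accepting 1-character tokens recovers
# the whitespace-split word set for this punctuation-free corpus.
full_vocab = vocab(lowercase=False, token_pattern=r"(?u)\b\w+\b")

print(len(naive_vocab), len(default_vocab), len(no_stop_vocab), len(capped_vocab))
print(sorted(naive_vocab - default_vocab))
```

If the last line prints words you expected to see as features, the defaults are the cause rather than min_df. Checking `vec.get_params()` on your own vectorizer shows which of these settings are active.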

Upvotes: 1
