Farheen Nilofer

Reputation: 171

AttributeError: get_feature_names not found; using scikit-learn

from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer()
vectorizer = vectorizer.fit(word_data)
freq_term_mat = vectorizer.transform(word_data)

from sklearn.feature_extraction.text import TfidfTransformer

tfidf = TfidfTransformer(norm="l2")
tfidf = tfidf.fit(freq_term_mat)
Ttf_idf_matrix = tfidf.transform(freq_term_mat)

voc_words = Ttf_idf_matrix.get_feature_names()
print "The num of words = ",len(voc_words)

When I run the program containing this code, I get the following error:

Traceback (most recent call last):
  File "vectorize_text.py", line 87, in <module>
    voc_words = Ttf_idf_matrix.get_feature_names()
  File "/home/farheen/anaconda/lib/python2.7/site-packages/scipy/sparse/base.py", line 499, in __getattr__
    raise AttributeError(attr + " not found")
AttributeError: get_feature_names not found

Please suggest a solution.

Upvotes: 4

Views: 17216

Answers (4)

Sadullah_math

Reputation: 11

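In scikit-learn 1.0 and later, get_feature_names() is deprecated (it was removed in 1.2), so you should change get_feature_names() to get_feature_names_out(). Either way, call it on the fitted vectorizer, not on the matrix.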
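If the same script has to run on both older and newer scikit-learn releases, a small helper can pick whichever method exists. This is only a sketch: the feature_names helper and the two-sentence word_data list are made up for illustration and are not part of the question's code.

```python
from sklearn.feature_extraction.text import TfidfVectorizer


def feature_names(vectorizer):
    """Return the fitted vocabulary as a list (hypothetical helper).

    Uses get_feature_names_out() when the installed scikit-learn has it,
    and falls back to the older get_feature_names() otherwise.
    """
    if hasattr(vectorizer, "get_feature_names_out"):
        return list(vectorizer.get_feature_names_out())
    return list(vectorizer.get_feature_names())


# Stand-in corpus; the real word_data comes from the asker's script.
word_data = ["the cat sat", "the dog sat down"]

vectorizer = TfidfVectorizer()
vectorizer.fit(word_data)

names = feature_names(vectorizer)
print("The num of words = %d" % len(names))
```

The helper is called on the vectorizer itself; the sparse matrix returned by transform() has neither method.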

Upvotes: 0

user6275647

Reputation: 381

I see two problems with your code. First, you are applying get_feature_names() to your matrix output, rather than to the vectorizer. You need to apply it to the vectorizer. Second, you are unnecessarily breaking this apart into too many steps. You can use TfidfVectorizer.fit_transform() to do what you want in much less space. Try this:

from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer()
transformed = vectorizer.fit_transform(word_data)
print "Num words:", len(vectorizer.get_feature_names())

Upvotes: 10

mac

Reputation: 177

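from sklearn.feature_extraction.text import TfidfVectorizer

# Fit on the documents and keep the dense TF-IDF matrix
TfIdfer = TfidfVectorizer(stop_words='english')
tfidf_matrix = TfIdfer.fit_transform(word_data).toarray()
# Ask the fitted vectorizer, not the matrix, for the vocabulary
names = TfIdfer.get_feature_names()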

Upvotes: 2

Maelstrom

Reputation: 453

Isn't it get_feature_names(), i.e. with an underscore after 'get'?

Also, I am not sure what you are trying to do, but get_feature_names is a method valid only for the *Vectorizer classes, not for TfidfTransformer. Maybe you want TfidfVectorizer instead?
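If you would rather keep the two-step CountVectorizer + TfidfTransformer pipeline from the question, one possible sketch is below. It reads the vocabulary from the fitted CountVectorizer's vocabulary_ attribute, which maps each term to its column index, so it does not depend on which get_feature_names variant your version provides. The word_data list here is an invented stand-in for your real data.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

# Stand-in corpus; replace with the real word_data.
word_data = ["the cat sat", "the dog sat down"]

# Step 1: raw term counts. The vectorizer learns the vocabulary here.
vectorizer = CountVectorizer()
freq_term_mat = vectorizer.fit_transform(word_data)

# Step 2: reweight the counts with TF-IDF. The result is a scipy sparse
# matrix, which knows nothing about the words behind its columns.
tfidf = TfidfTransformer(norm="l2")
tf_idf_matrix = tfidf.fit_transform(freq_term_mat)

# vocabulary_ maps term -> column index, so sorting the terms by that
# index lists them in the same order as the matrix columns.
voc_words = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)

print("The num of words = %d" % len(voc_words))
```

The number of words equals the number of columns in tf_idf_matrix, which is a quick sanity check that the names line up with the matrix.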

Upvotes: 2
