Reputation: 69
I have a list of sentences:
a = [['i am a testing'],['we are working on project']]
I am trying to create a word dictionary for all the sentences in the list. I tried:
vectorizer = CountVectorizer()
vectorizer.fit_transform(a)
coffee_dict2 = vectorizer.vocabulary_
But I am getting the error AttributeError: 'list' object has no attribute 'lower'
The result I am expecting is a dictionary:
{'i': 1, 'am': 1, 'testing': 2}
Upvotes: 1
Views: 59
Reputation: 862621
You need to flatten the nested lists, because CountVectorizer expects an iterable of strings and each of your items is a list:
from itertools import chain

from sklearn.feature_extraction.text import CountVectorizer

coffee_reviews_test = [['i am a testing'], ['we are working on project']]

vectorizer = CountVectorizer()
# chain.from_iterable flattens the list of lists into one stream of strings
vectorizer.fit_transform(chain.from_iterable(coffee_reviews_test))
Another solution, using a nested list comprehension to flatten:
vectorizer.fit_transform([x for y in coffee_reviews_test for x in y])
coffee_dict2 = vectorizer.vocabulary_
print(coffee_dict2)
{'am': 0, 'testing': 4, 'we': 5, 'are': 1, 'working': 6, 'on': 2, 'project': 3}
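Note that vocabulary_ maps each word to its column index in the document-term matrix, not to a count, and that the default token_pattern only keeps tokens of two or more characters, which is why 'i' and 'a' are missing above. If what you actually want is word frequencies, here is a minimal sketch. It assumes you want to keep single-character words (hence the wider token_pattern), and the names flat_reviews, totals and word_counts are only illustrative:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

coffee_reviews_test = [['i am a testing'], ['we are working on project']]
flat_reviews = [x for y in coffee_reviews_test for x in y]

# Assumption: single-character words such as 'i' and 'a' should be kept,
# so the default token_pattern (two or more word characters) is widened.
vectorizer = CountVectorizer(token_pattern=r"(?u)\b\w+\b")
X = vectorizer.fit_transform(flat_reviews)

# Sum each column of the document-term matrix to get totals over all sentences.
totals = np.asarray(X.sum(axis=0)).ravel()

# vocabulary_ maps word -> column index, so use it to look up each total.
word_counts = {word: int(totals[idx]) for word, idx in vectorizer.vocabulary_.items()}
print(word_counts)
```

With these two sentences every word occurs once, so each count is 1. Drop the token_pattern argument if you are fine with the default tokenization.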
Upvotes: 6