Reputation: 121
I'm new to python and programming and need your help.
I'm trying to count the most common words in a text using nltk.word_tokenize and Counter. When I get the list of all elements of the text and want to count them, Counter counts only letters.
This is the code:
from collections import Counter
from nltk.tokenize import word_tokenize

word_counter = Counter()
test3 = "hello, hello, how are you? It's me - Boris"
words = word_tokenize(test3)
print(words)
# ['hello', ',', 'hello', ',', 'how', 'are', 'you', '?', 'It', "'s", 'me', '-', 'Boris']
for word in words:
    word_counter.update(word)
print(word_counter)
The output:
Counter({'o': 5, 'e': 4, 'l': 4, 'h': 3, ',': 2, 'r': 2, 's': 2, 'w': 1, 'a': 1, 'y': 1, 'u': 1, '?': 1, 'I': 1, 't': 1, "'": 1, 'm': 1, '-': 1, 'B': 1, 'i': 1})
How could I solve that? I looked through some topics; they solve it with text.split(), but that is not as precise as nltk.
Thank you!
Upvotes: 1
Views: 218
Reputation: 73460
Just use Counter as follows:
word_counter = Counter(words)
Counter.update takes an iterable and updates the counts for the elements the iterable produces. In your loop, that would be the letters of each word (remember that strings are iterables).
If you were to use update, you could do:
from collections import Counter
from nltk.tokenize import word_tokenize

word_counter = Counter()
# ...
words = word_tokenize(test3)
word_counter.update(words)
But there is no need to separate the initialization of the counter and the actual counting unless you want to repeat the second step for multiple lists of words.
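To make the difference concrete, here is a minimal sketch of both approaches side by side. It hardcodes the token list that word_tokenize produced for the example sentence, so it runs without nltk installed:

```python
from collections import Counter

# Tokens as word_tokenize produced them for the example sentence
words = ['hello', ',', 'hello', ',', 'how', 'are', 'you', '?',
         'It', "'s", 'me', '-', 'Boris']

# Correct: pass the whole list, so whole tokens are counted
word_counter = Counter(words)
print(word_counter.most_common(2))   # [('hello', 2), (',', 2)]

# Buggy variant from the question: update() iterates over each
# string, so it counts individual characters instead of words
char_counter = Counter()
for word in words:
    char_counter.update(word)
print(char_counter['o'])             # 5 -- letters, not words
```

most_common(n) is also the direct answer to "count the most common words": it returns the n highest-count (token, count) pairs.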
Upvotes: 1