Reputation: 127
I was trying to use a dictionary to count word frequency on a given string. Say:
s = 'I ate an apple a big apple'
I understand the best way to count word frequency is probably to use collections.Counter. But I want to know if I can solve this by using a dictionary comprehension.
My original method (without a dictionary comprehension) was
dict = {}
for token in s.split(" "):
    dict[token] = dict.get(token, 0) + 1
and it works fine:
dict
{'I': 1, 'a': 1, 'an': 1, 'apple': 2, 'ate': 1, 'big': 1}
I tried to use a dictionary comprehension to do this, like
dict = {}
dict = {token: dict.get(token, 0) + 1 for token in s.split(" ")}
But this didn't work.
dict
{'I': 1, 'a': 1, 'an': 1, 'apple': 1, 'ate': 1, 'big': 1}
What's wrong with the dictionary comprehension? Is it because I referred to the dictionary itself inside the comprehension, so every time I called dict.get('apple', 0) in the comprehension, I got 0? However, I don't know how to test this, so I am not 100% sure.
P.S. If it makes any difference, I am using Python 3.
Upvotes: 2
Views: 2401
Reputation: 7815
For your dictionary comprehension to work, you need a reference to the comprehension inside itself. Something like this would work
{token: __me__.get(token, 0) + 1 for token in s.split(" ")}
if there were such a thing as '__me__' referencing the comprehension being built. In Python 3 there is no documented way to do this.
According to this answer, an undocumented "implementation artifact" (on which Python users should not rely) can be used in Python 2.5 and 2.6 to write a self-referencing list comprehension. Maybe a similar hack exists for dictionary comprehensions in Python 3 too.
Upvotes: 1
Reputation: 7100
You could also use list.count(), as:
s = 'I ate an apple a big apple'
print({token: s.split().count(token) for token in set(s.split())})
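One thing to keep in mind: count() rescans the whole token list once per distinct token. For comparison, here is a rough sketch that puts this approach next to the collections.Counter one mentioned in the question, which should produce the same counts in a single pass. The names tokens, by_count and by_counter are just mine for illustration.

```python
from collections import Counter

s = 'I ate an apple a big apple'
tokens = s.split()  # split once and reuse the list

# Comprehension approach: one count() scan per distinct token.
by_count = {token: tokens.count(token) for token in set(tokens)}

# Counter approach: a single pass over the tokens.
by_counter = Counter(tokens)

# Both should agree on every word's frequency.
print(by_count == dict(by_counter))
```

For a short string like this the difference is negligible; it would only start to matter with long inputs containing many distinct words.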
Upvotes: 1
Reputation: 599956
If you go through your code operation by operation, you will see what is wrong.
First you set dict to an empty dict. (As mentioned in the comments, it's a bad idea to use that for your own variable name, but that's not the problem here.)
Secondly, your dict comprehension is evaluated. At this point the name dict still refers to the empty dict. So every time you do dict.get(whatever, 0), it will always get the default.
Finally, your populated dict is reassigned to the name dict, replacing the empty one that was previously there.
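If you want to check this yourself, one way is to record what the dict looks like each time the comprehension looks something up. This is only a sketch: I renamed the variable to counts to avoid shadowing the built-in, and lookup and seen_sizes are hypothetical helpers added purely for the demonstration.

```python
s = 'I ate an apple a big apple'

counts = {}       # step 1: the name refers to an empty dict
seen_sizes = []   # hypothetical log: size of `counts` at each lookup

def lookup(token):
    # Record how many keys `counts` holds when the comprehension asks for a token.
    seen_sizes.append(len(counts))
    return counts.get(token, 0)

# Step 2: the comprehension builds a separate, new dict.
# While it runs, `counts` is still the empty dict from step 1.
result = {token: lookup(token) + 1 for token in s.split(" ")}

print(seen_sizes)  # every entry is 0: `counts` was empty on every lookup
print(result)      # every value is 1, including 'apple'

# Step 3: only now would the new dict be bound to the old name.
counts = result
```

Since the lookups all see an empty dict, every token gets the default 0 plus 1, which matches the output in the question.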
Upvotes: 2