bonbon

Reputation: 121

How to count word occurrences with a dictionary in Python?

I have a question about dictionary handling in Python.

I'd like to know how to compare a dictionary against a list of words.

[example of my input]

text = ['fall', 'big', 'pocket', 'layer', 'park', ...]
mylist = ['spring', 'summer', 'fall', 'winter', 'school', ...]

[code to build the dictionary]

lst_dic = {w: 0 for w in mylist}
lst_dic

I want to know how many times each word in mylist occurs in text.

[failing code]

lst_dic = {}
for w in mylist:
    if w in text_1_1:
        lst_dic[w] += 1
    else:
        lst_dic[w] = 0

As a result, I want a dictionary mapping each word to its count, like this:

{ 'spring':0,
  'summer':3,
  'fall':1,
  'winter':0,
  ...
}

Then I want to extract the entries whose count is greater than 10.

Please take a look at this issue.

Upvotes: 0

Views: 67

Answers (3)

mhhabib

Reputation: 3121

Use a defaultdict(int) for lst_dic so that missing keys start at 0. Add 1 to a word's count each time it matches, then keep only the entries whose value is greater than 10.

from collections import defaultdict
text = ['fall', 'big', 'pocket', 'layer', 'park']
mylist = ['spring', 'summer', 'fall', 'winter', 'school']
lst_dic = defaultdict(int)

# Loop over text (not mylist) so that repeated words are counted each time.
for w in text:
    if w in mylist:
        lst_dic[w] += 1
result = {key: value for key, value in lst_dic.items() if value > 10}
print(result)
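
Note that a defaultdict only holds the words that were actually matched, so words with a count of 0 (like 'spring' in the desired output) would be missing from it. Here is a minimal sketch of one way to keep them, by seeding every word with 0 first. The sample text with repeated words and the helper name count_words are made up for illustration and are not from the question.

```python
def count_words(text, mylist):
    """Count how often each word of mylist occurs in text, keeping zeros."""
    # Seed every wanted word with 0 so unmatched words still appear.
    counts = dict.fromkeys(mylist, 0)
    for w in text:
        if w in counts:
            counts[w] += 1
    return counts


# Hypothetical sample data with repeated words.
text = ['fall', 'summer', 'big', 'summer', 'park', 'summer']
mylist = ['spring', 'summer', 'fall', 'winter']

counts = count_words(text, mylist)
print(counts)  # {'spring': 0, 'summer': 3, 'fall': 1, 'winter': 0}
```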

Upvotes: 2

Mehdi

Reputation: 1187

To avoid using many loops, I prefer to follow this approach:

1- Count all the words in text using Counter().

from collections import Counter
text_count = dict(Counter(text))

2- Create an empty dictionary, mylist_count, then iterate through mylist, look each word up in text_count, and store its count (or 0 if the word is missing).

mylist_count = dict()
for el in mylist:
    try:
        mylist_count[el] = text_count[el]
    except KeyError:
        # The word never occurs in text.
        mylist_count[el] = 0
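
As a possible simplification: a Counter returns 0 for keys it has never seen, so if it is not converted to a plain dict, the try/except can probably be dropped. Here is a short sketch, assuming a made-up text with repeated words (the sample data is not from the question):

```python
from collections import Counter

# Hypothetical sample data with repeated words.
text = ['fall', 'summer', 'big', 'summer', 'park', 'summer']
mylist = ['spring', 'summer', 'fall', 'winter']

# Keep the Counter itself: looking up a missing key gives 0, not a KeyError.
text_count = Counter(text)
mylist_count = {el: text_count[el] for el in mylist}
print(mylist_count)  # {'spring': 0, 'summer': 3, 'fall': 1, 'winter': 0}
```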

Upvotes: 1

costaparas

Reputation: 5237

Your code resets lst_dic to an empty dictionary, so the first time a word from mylist is matched, lst_dic[w] += 1 tries to read a key that doesn't exist yet. Hence, you get a KeyError.

Use collections.defaultdict to initialize the dictionary instead. This lets you remove the else branch from your code and simply increment the count each time a word matches.

import collections

text = ['fall', 'big', 'pocket', 'layer', 'park']
mylist = ['spring', 'summer', 'fall', 'winter', 'school']

lst_dic = collections.defaultdict(int)
# Loop over text (not mylist) so that repeated words are counted each time.
for w in text:
    if w in mylist:
        lst_dic[w] += 1

# Show the counts of all `text` words occurring in `mylist`:
print(dict(lst_dic))

# Extract those with counts > 10:
print([e for e in lst_dic if lst_dic[e] > 10])

# Or, if you want it as a dictionary:
print({e: lst_dic[e] for e in lst_dic if lst_dic[e] > 10})
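
With the five-word sample lists above, no count gets anywhere near 10, so both filters print empty results. To see the threshold doing something, here is a rough sketch with a made-up longer text. The helper name frequent_words, its threshold parameter, and the sample data are assumptions for illustration only.

```python
import collections


def frequent_words(text, mylist, threshold=10):
    """Return the words of mylist that occur more than `threshold` times in text."""
    wanted = set(mylist)  # set lookup is faster than scanning the list
    counts = collections.defaultdict(int)
    for w in text:
        if w in wanted:
            counts[w] += 1
    return {w: n for w, n in counts.items() if n > threshold}


# Hypothetical data: 'summer' occurs 12 times, 'fall' 3 times, 'park' 20 times.
text = ['summer'] * 12 + ['fall'] * 3 + ['park'] * 20
mylist = ['spring', 'summer', 'fall', 'winter']

frequent = frequent_words(text, mylist)
print(frequent)  # {'summer': 12}
```

'park' is left out even though it occurs 20 times, because it is not in mylist.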

Upvotes: 2
