Count frequency of words in multiple lists from a larger vocabulary?

Question

I know how to count the frequency of elements in a list, but here's a slightly different question. I have a larger vocabulary and a few lists that each use only part of it. Using numbers instead of words as an example:

vocab=[1,2,3,4,5,6,7]
list1=[1,2,3,4]
list2=[2,3,4,5,6,6,7]
list3=[3,2,4,4,1]

and I want the output to keep "0"s when a word is not used:

count1=[1,1,1,1,0,0,0]
count2=[0,1,1,1,1,2,1]
count3=[1,1,1,2,0,0,0]

I guess I need to sort the words, but how do I keep the "0" records?

cs95 · Accepted Answer

This can be done with the list object's built-in count method, inside a list comprehension.

>>> vocab = [1, 2, 3, 4, 5, 6, 7]
>>> list1 = [1, 2, 3, 4]
>>> list2 = [2, 3, 4, 5, 6, 6, 7]
>>> list3 = [3, 2, 4, 4, 1]
>>> [list1.count(v) for v in vocab]
[1, 1, 1, 1, 0, 0, 0]
>>> [list2.count(v) for v in vocab]
[0, 1, 1, 1, 1, 2, 1]
>>> [list3.count(v) for v in vocab]
[1, 1, 1, 2, 0, 0, 0]

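Each comprehension iterates over every value in vocab and calls count on the list for it. A value that never appears in the list gets a count of 0, and the result follows the order of vocab, so no sorting is needed.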
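Note that list.count scans the whole list once per vocab entry, which may get slow for a large vocabulary or long lists. As a possible alternative (not part of the original answer), here is a minimal sketch using collections.Counter, which tallies the list in one pass and returns 0 for missing keys. The helper name count_in_vocab is made up for illustration.

```python
from collections import Counter


def count_in_vocab(words, vocab):
    """Return the count of each vocab entry in words, in vocab order.

    count_in_vocab is a hypothetical helper name, not from the original answer.
    """
    counts = Counter(words)  # one pass over words
    # Looking up a key that Counter has not seen returns 0, so unused
    # vocab entries keep their "0" record.
    return [counts[v] for v in vocab]


vocab = [1, 2, 3, 4, 5, 6, 7]
list1 = [1, 2, 3, 4]
list2 = [2, 3, 4, 5, 6, 6, 7]
list3 = [3, 2, 4, 4, 1]

count1 = count_in_vocab(list1, vocab)
count2 = count_in_vocab(list2, vocab)
count3 = count_in_vocab(list3, vocab)
```

This assumes the vocab entries are hashable (numbers and strings are). Items in a list that are not in vocab are simply ignored in the output.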
