Monica

Reputation: 1070

counting frequency in list

I have a list of lists:

countall = [[5, 0], [4, 1], [4, 1], [3, 2], [4, 1], [3, 2], [3, 2], [2, 3], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [3, 2], [2, 3], [2, 3], [1, 4], [2, 3], [1, 4], [1, 4], [0, 5]]

I would like to find the frequency of sub-lists in the above list.

I have tried to use itertools:

freq = [len(list(group)) for x in countall for key, group in groupby(x)]

However, I am getting the wrong results:

[1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1]

What is wrong with my list comprehension?

Upvotes: 1

Views: 1434

Answers (3)

Bemmu

Reputation: 18217

groupby only groups equal elements that are adjacent to each other, so to use it you would need to sort the list first. Another option is to use the Counter class:

from collections import Counter
countall = [[5, 0], [4, 1], [4, 1], [3, 2], [4, 1], [3, 2], [3, 2], [2, 3], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [3, 2], [2, 3], [2, 3], [1, 4], [2, 3], [1, 4], [1, 4], [0, 5]]
Counter([tuple(x) for x in countall])

Output:

Counter({(3, 2): 10, (2, 3): 10, (1, 4): 5, (4, 1): 5, (5, 0): 1, (0, 5): 1})
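To illustrate why sorting matters for groupby, here is a minimal sketch on a shortened list (the four-element `data` list is made up for illustration and is not the question's full data). Without sorting, equal sub-lists that are not adjacent end up in separate groups:

```python
from itertools import groupby

# Small illustrative sample, not the full list from the question.
data = [[4, 1], [3, 2], [4, 1], [4, 1]]

# Unsorted: groupby starts a new group every time the value changes,
# so the first [4, 1] is counted separately from the last two.
unsorted_runs = [(key, len(list(group))) for key, group in groupby(data)]
print(unsorted_runs)  # [([4, 1], 1), ([3, 2], 1), ([4, 1], 2)]

# Sorted: equal sub-lists are adjacent, so each one forms a single group.
sorted_runs = [(key, len(list(group))) for key, group in groupby(sorted(data))]
print(sorted_runs)  # [([3, 2], 1), ([4, 1], 3)]
```

Counter does not have this restriction because it tallies every element regardless of position.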

Upvotes: 4

Vikash Singh

Reputation: 14001

As pointed out by ForceBru, first sort your list, then use groupby:

from itertools import groupby
countall = [[5, 0], [4, 1], [4, 1], [3, 2], [4, 1], [3, 2], [3, 2], [2, 3], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [3, 2], [2, 3], [2, 3], [1, 4], [2, 3], [1, 4], [1, 4], [0, 5]]

freq = [(key, len(list(x))) for key, x in groupby(sorted(countall))]
print(freq)

Output:

[([0, 5], 1), ([1, 4], 5), ([2, 3], 10), ([3, 2], 10), ([4, 1], 5), ([5, 0], 1)]

Your code has a problem: it runs groupby on each individual sub-list in countall, which is not needed:

freq = [len(list(group)) for x in countall for key, group in groupby(x)]

The nested loop

for x in countall for key, group in groupby(x)

only groups the numbers inside each pair, never the pairs themselves. You can call groupby directly on sorted(countall) instead.
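A small sketch of the difference, using a shortened list (the `[2, 2]` entry is a made-up addition so that a group of length 2 shows up; it is not in the question's data). The first comprehension is the one from the question, the second groups the sub-lists themselves:

```python
from itertools import groupby

# Shortened sample; [2, 2] is hypothetical and only added for illustration.
countall = [[5, 0], [4, 1], [4, 1], [2, 2]]

# Comprehension from the question: groupby(x) runs on the numbers inside
# each sub-list, so it yields run lengths within every pair.
per_pair = [len(list(group)) for x in countall for key, group in groupby(x)]
print(per_pair)  # [1, 1, 1, 1, 1, 1, 2]

# Grouping the sorted outer list counts how often each sub-list occurs.
freq = [(key, len(list(group))) for key, group in groupby(sorted(countall))]
print(freq)  # [([2, 2], 1), ([4, 1], 2), ([5, 0], 1)]
```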

Also, as answered by @Bemmu, you can use collections.Counter. Counter needs hashable keys and lists are not hashable, so you will first have to convert each sub-list to a tuple (or a string) and then use Counter.
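As a rough sketch of that point, assuming a shortened three-element list (`pairs` is a made-up name and sample): passing lists straight to Counter raises a TypeError, while tuples work.

```python
from collections import Counter

# Shortened illustrative sample, not the full list from the question.
pairs = [[4, 1], [3, 2], [4, 1]]

error_message = None
try:
    Counter(pairs)  # lists are unhashable, so this raises TypeError
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Converting each sub-list to a tuple makes it usable as a Counter key.
counts = Counter(map(tuple, pairs))
print(counts[(4, 1)])         # 2
print(counts.most_common(1))  # [((4, 1), 2)]
```

Tuples are usually the more convenient choice over strings, since the keys can be unpacked back into numbers without parsing.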

Upvotes: 3

Stephen Rauch

Reputation: 49794

As noted in the comments, you will need to sort the list first if you are using groupby.

Code:

import itertools as it
freq = {tuple(key): len(list(group)) for key, group in it.groupby(sorted(countall))}

Test Code:

countall = [[5, 0], [4, 1], [4, 1], [3, 2], [4, 1], [3, 2], [3, 2], [2, 3],
           [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4],
           [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4],
           [3, 2], [2, 3], [2, 3], [1, 4], [2, 3], [1, 4], [1, 4], [0, 5]]

print(freq)

Results:

{(3, 2): 10, (1, 4): 5, (2, 3): 10, (5, 0): 1, (0, 5): 1, (4, 1): 5}

Upvotes: 1
