Reputation: 1070
I have a list of lists:
countall = [[5, 0], [4, 1], [4, 1], [3, 2], [4, 1], [3, 2], [3, 2], [2, 3], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [3, 2], [2, 3], [2, 3], [1, 4], [2, 3], [1, 4], [1, 4], [0, 5]]
I would like to find the frequency of sub-lists in the above list.
I have tried to use itertools:
freq = [len(list(group)) for x in countall for key, group in groupby(x)]
However, I am getting the wrong results:
[1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 2, 1]
What is wrong with my list comprehension?
Upvotes: 1
Views: 1434
Reputation: 18217
groupby only groups equal elements that are adjacent to each other, so to use it you would need to sort the list first. Another option is to use the Counter class:
from collections import Counter
countall = [[5, 0], [4, 1], [4, 1], [3, 2], [4, 1], [3, 2], [3, 2], [2, 3], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [3, 2], [2, 3], [2, 3], [1, 4], [2, 3], [1, 4], [1, 4], [0, 5]]
Counter([tuple(x) for x in countall])
Output:
Counter({(3, 2): 10, (2, 3): 10, (1, 4): 5, (4, 1): 5, (5, 0): 1, (0, 5): 1})
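To illustrate why groupby needs sorted input, here is a minimal sketch on a shortened list (the name data and its contents are made up for illustration, they are not the asker's full list). It compares the runs groupby reports with and without sorting:

```python
from itertools import groupby

# Hypothetical shortened input; [4, 1] appears three times, but not all adjacent.
data = [[4, 1], [3, 2], [4, 1], [4, 1]]

# Without sorting, [4, 1] shows up in two separate runs.
unsorted_runs = [(key, len(list(group))) for key, group in groupby(data)]

# After sorting, equal sub-lists are adjacent and form a single run.
sorted_runs = [(key, len(list(group))) for key, group in groupby(sorted(data))]

print(unsorted_runs)  # [([4, 1], 1), ([3, 2], 1), ([4, 1], 2)]
print(sorted_runs)    # [([3, 2], 1), ([4, 1], 3)]
```

Counter does not have this restriction because it tallies by key regardless of position.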
Upvotes: 4
Reputation: 14001
As pointed out by ForceBru, first sort your list, then use groupby:
from itertools import groupby
countall = [[5, 0], [4, 1], [4, 1], [3, 2], [4, 1], [3, 2], [3, 2], [2, 3], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4], [3, 2], [2, 3], [2, 3], [1, 4], [2, 3], [1, 4], [1, 4], [0, 5]]
freq = [(key, len(list(x))) for key, x in groupby(sorted(countall))]
print(freq)
output:
[([0, 5], 1), ([1, 4], 5), ([2, 3], 10), ([3, 2], 10), ([4, 1], 5), ([5, 0], 1)]
Your code has bugs:
freq = [len(list(group)) for x in countall for key, group in groupby(x)]
^ parenthesis missing
Also, you are grouping within each individual list in countall, which is not needed:
for x in countall for key, group in groupby(x)
You can call groupby directly on sorted(countall).
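As a rough sketch of the difference, using a made-up three-element list (small is an illustrative name, not from the question), the nested comprehension counts runs of equal numbers inside each sub-list, while grouping the sorted outer list counts whole sub-lists:

```python
from itertools import groupby

# Hypothetical shortened input for illustration.
small = [[5, 0], [4, 1], [4, 1]]

# Nested form: groupby runs on each sub-list separately, so it sees
# 5 then 0, then 4 then 1, and so on; every run has length 1 here.
inner = [len(list(group)) for x in small for key, group in groupby(x)]

# Direct form: groupby runs once over the sorted outer list.
outer = [(key, len(list(group))) for key, group in groupby(sorted(small))]

print(inner)  # [1, 1, 1, 1, 1, 1]
print(outer)  # [([4, 1], 2), ([5, 0], 1)]
```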
Also, as answered by @Bemmu, you can use collections.Counter. However, Counter needs hashable keys and lists are not hashable,
so you will first have to convert each sub-list to a tuple or string and then use Counter.
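A small sketch of the hashability point, assuming a made-up list called sample: passing lists straight to Counter raises TypeError, while converting each sub-list to a tuple works.

```python
from collections import Counter

# Hypothetical shortened input for illustration.
sample = [[4, 1], [3, 2], [4, 1]]

# Lists are unhashable, so they cannot be used as Counter keys directly.
try:
    Counter(sample)
    list_keys_ok = True
except TypeError:
    list_keys_ok = False

# Converting each sub-list to a tuple gives hashable keys.
counts = Counter(map(tuple, sample))

print(list_keys_ok)           # False
print(counts[(4, 1)])         # 2
print(counts.most_common(1))  # [((4, 1), 2)]
```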
Upvotes: 3
Reputation: 49794
As noted in the comments, you will need to sort the list first if you are using groupby.
Code:
import itertools as it
freq = {tuple(key): len(list(group)) for key, group in it.groupby(sorted(countall))}
Test Code:
countall = [[5, 0], [4, 1], [4, 1], [3, 2], [4, 1], [3, 2], [3, 2], [2, 3],
[4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4],
[4, 1], [3, 2], [3, 2], [2, 3], [3, 2], [2, 3], [2, 3], [1, 4],
[3, 2], [2, 3], [2, 3], [1, 4], [2, 3], [1, 4], [1, 4], [0, 5]]
print(freq)
Results:
{(3, 2): 10, (1, 4): 5, (2, 3): 10, (5, 0): 1, (0, 5): 1, (4, 1): 5}
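As a quick sanity check, the following sketch (with a made-up list named sample) suggests that the sorted groupby approach and the Counter approach produce the same counts; only the key order may differ, which dict equality ignores:

```python
from collections import Counter
from itertools import groupby

# Hypothetical shortened input for illustration.
sample = [[3, 2], [4, 1], [3, 2], [0, 5]]

# Sort, then count each run of equal sub-lists.
by_groupby = {tuple(key): len(list(group)) for key, group in groupby(sorted(sample))}

# Tally tuples directly, no sorting needed.
by_counter = dict(Counter(map(tuple, sample)))

print(by_groupby == by_counter)  # True
```

Since Counter avoids the sort, it is likely the simpler choice when the order of the result does not matter.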
Upvotes: 1