Reputation: 5182
Hi, I have a list of values. I want to get another list with the number of times each value in that list occurs. This is fairly easy, but I also need the values that are not present in the original list to appear in the frequency list, with a count of 0. For example:
I = [0,1,1,2,2,2,4,4,5,5,6,6,6,8,8,8]
What you expect:
freqI = [1,2,3,2,2,3,3]
What I need:
freqI = [1,2,3,0,2,2,3,0,3]
As you can see 3 and 7 are not present in I, though they are still accounted for in the frequency list.
My initial attempt gave me only the first kind of result (without the intermediate values):
import operator
d = {x:I.count(x) for x in I}
sorted_x = sorted(d.iteritems(), key=operator.itemgetter(0))
How can I get the frequency count (aka histogram) of my list, with the intermediate values present?
Upvotes: 4
Views: 3145
Reputation: 133554
Your list looks like it's in sorted order. If so, you can take the bounds straight from its first and last elements:
>>> from collections import Counter
>>> I = [0,1,1,2,2,2,4,4,5,5,6,6,6,8,8,8]
>>> c = Counter(I)
>>> [c[i] for i in range(I[0], I[-1]+1)]
[1, 2, 3, 0, 2, 2, 3, 0, 3]
Upvotes: 2
Reputation: 250951
>>> lis = [0,1,1,2,2,2,4,4,5,5,6,6,6,8,8,8]
>>> maxx,minn = max(lis),min(lis)
>>> from collections import Counter
>>> c = Counter(lis)
>>> [c[i] for i in xrange(minn,maxx+1)]
[1, 2, 3, 0, 2, 2, 3, 0, 3]
Or, as suggested by @DSM, we can get min and max from the dict itself:
>>> [c[i] for i in xrange( min(c) , max(c)+1)]
[1, 2, 3, 0, 2, 2, 3, 0, 3]
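If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: the name dense_counts is made up for illustration, it is written for Python 3 (range instead of xrange), and it assumes the input contains integers only. The empty-list guard is my addition, since min() and max() raise ValueError on an empty sequence.

```python
from collections import Counter

def dense_counts(values):
    """Return counts for every integer from min(values) to max(values), inclusive."""
    if not values:
        return []  # min()/max() would raise ValueError on an empty sequence
    c = Counter(values)
    # Missing keys count as 0, so gaps in the data show up as zeros.
    return [c[i] for i in range(min(c), max(c) + 1)]

lis = [0, 1, 1, 2, 2, 2, 4, 4, 5, 5, 6, 6, 6, 8, 8, 8]
print(dense_counts(lis))        # [1, 2, 3, 0, 2, 2, 3, 0, 3]
print(dense_counts([8, 0, 8]))  # input order does not matter
```

Because the bounds come from min and max rather than from the first and last elements, this does not depend on the list being sorted.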
Upvotes: 8
Reputation: 137574
How about
>>> I = [0,1,1,2,2,2,4,4,5,5,6,6,6,8,8,8]
>>> from collections import Counter
>>> frequencies = Counter(I)
>>> frequencies
Counter({2: 3, 6: 3, 8: 3, 1: 2, 4: 2, 5: 2, 0: 1})
You can query the counter for any number. For numbers it hasn't seen, it gives 0:
>>> frequencies[42]
0
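To turn that into the list from the question, you can query the counter over whatever range you care about. A rough sketch in Python 3; the bounds low and high are hypothetical values picked for illustration, not something the question specifies:

```python
from collections import Counter

I = [0, 1, 1, 2, 2, 2, 4, 4, 5, 5, 6, 6, 6, 8, 8, 8]
frequencies = Counter(I)

# Hypothetical fixed bounds chosen for illustration: report counts for 0..9.
low, high = 0, 9
freq_list = [frequencies[n] for n in range(low, high + 1)]
print(freq_list)  # [1, 2, 3, 0, 2, 2, 3, 0, 3, 0]

# Looking up a missing key returns 0 without storing it in the counter.
print(3 in frequencies)  # False
```

Since lookups of unseen numbers don't add entries, querying a wide range doesn't grow the counter.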
Upvotes: 5
Reputation: 17168
You're close; you just want to iterate over a range of integers instead of over your actual list. Something like this:
# note: lowercase variable names are Python standard and good coding practice!
d = {n: list_of_ints.count(n) for n in range(max(list_of_ints) + 1)}
Note that I'm using max(list_of_ints) + 1 as the end of the range, since range excludes its end point and you didn't specify an upper bound; max(list_of_ints) is just the biggest element in your list (I in your question). Obviously you could hardcode this number instead, or, if you want to restrict your histogram to the range of data in the list, you can make it range(min(list_of_ints), max(list_of_ints) + 1).
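One thing to keep in mind: calling list.count once per candidate value rescans the whole list each time, which can get slow for long lists or wide ranges. If that matters, a single pass over the data does the same job. A minimal sketch, assuming the list holds integers only; the function name histogram is just illustrative:

```python
def histogram(list_of_ints):
    """Count each integer from min to max in one pass over the data."""
    if not list_of_ints:
        return []
    low, high = min(list_of_ints), max(list_of_ints)
    counts = [0] * (high - low + 1)  # one slot per integer in [low, high]
    for n in list_of_ints:
        counts[n - low] += 1         # shift by low so the smallest value maps to index 0
    return counts

I = [0, 1, 1, 2, 2, 2, 4, 4, 5, 5, 6, 6, 6, 8, 8, 8]
print(histogram(I))  # [1, 2, 3, 0, 2, 2, 3, 0, 3]
```

This returns a list rather than a dict; index k corresponds to the value low + k.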
Upvotes: 1