Olivier_s_j

Reputation: 5182

Get frequency count of elements in an array

Hi, I have a list of values. I want to get another list with the number of times each value in that list occurs. This is fairly easy, but I also need the values that are not present in the original list to appear in the frequency list, with a count of 0. For example:

I = [0,1,1,2,2,2,4,4,5,5,6,6,6,8,8,8]

What you expect:

freqI = [1,2,3,2,2,2,3,3]

What I need:

freqI = [1,2,3,0,2,2,3,0,3]

As you can see, 3 and 7 are not present in I, but they are still accounted for in the frequency list.

My initial try ended up giving me the first kind of solution (without the intermediate values):

import operator

d = {x: I.count(x) for x in I}

sorted_x = sorted(d.items(), key=operator.itemgetter(0))

How can I get the frequency count (aka histogram) of my array, with the intermediate values present?

Upvotes: 4

Views: 3145

Answers (5)

YXD

Reputation: 32511

[I.count(k) for k in range(max(I) + 1)]

Upvotes: 2

jamylak

Reputation: 133554

Your list looks like it's in sorted order. If so, you can take the bounds straight from its first and last elements:

>>> from collections import Counter
>>> I = [0,1,1,2,2,2,4,4,5,5,6,6,6,8,8,8]
>>> c = Counter(I)
>>> [c[i] for i in range(I[0], I[-1]+1)]
[1, 2, 3, 0, 2, 2, 3, 0, 3]

Upvotes: 2

Ashwini Chaudhary

Reputation: 250951

>>> lis = [0,1,1,2,2,2,4,4,5,5,6,6,6,8,8,8]
>>> maxx,minn = max(lis),min(lis)
>>> from collections import Counter
>>> c = Counter(lis)
>>> [c[i] for i in xrange(minn,maxx+1)]
[1, 2, 3, 0, 2, 2, 3, 0, 3]

or as suggested by @DSM we can get min and max from the dict itself:

>>> [c[i] for i in xrange( min(c) , max(c)+1)]
[1, 2, 3, 0, 2, 2, 3, 0, 3]
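
For anyone on Python 3, where xrange no longer exists, the same idea can be wrapped in a small function. This is only a sketch: the name frequency_list and the way it treats an empty list are my own assumptions, not part of the answer above.

```python
from collections import Counter

def frequency_list(values):
    """Count each integer from min(values) to max(values), gaps included."""
    # frequency_list is a hypothetical helper name, not from the original answer.
    if not values:
        return []  # assumption: an empty input gives an empty histogram
    counts = Counter(values)  # one pass over the data
    # Counter returns 0 for keys it has never seen, so gaps fill themselves.
    return [counts[i] for i in range(min(counts), max(counts) + 1)]

I = [0, 1, 1, 2, 2, 2, 4, 4, 5, 5, 6, 6, 6, 8, 8, 8]
print(frequency_list(I))  # [1, 2, 3, 0, 2, 2, 3, 0, 3]
```

Note that the list starts at min(values), so position k in the result only matches value k when the smallest value is 0.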

Upvotes: 8

Colonel Panic

Reputation: 137574

How about

>>> I = [0,1,1,2,2,2,4,4,5,5,6,6,6,8,8,8]
>>> from collections import Counter
>>> frequencies = Counter(I)
>>> frequencies
Counter({2: 3, 6: 3, 8: 3, 1: 2, 4: 2, 5: 2, 0: 1})

You can query the counter for any number. For numbers it hasn't seen, it gives 0:

>>> frequencies[42]
0
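
To get from those lookups to the list the question asks for, one option is to query the counter for every integer in the range. A minimal sketch, assuming the values are non-negative integers and the list should start at 0; freq_list is just an illustrative name:

```python
from collections import Counter

I = [0, 1, 1, 2, 2, 2, 4, 4, 5, 5, 6, 6, 6, 8, 8, 8]
frequencies = Counter(I)

# Looking up a missing key returns 0 and does not add the key to the counter.
assert frequencies[3] == 0
assert 3 not in frequencies

# freq_list is an illustrative name; index k holds the count of value k.
freq_list = [frequencies[k] for k in range(max(I) + 1)]
print(freq_list)  # [1, 2, 3, 0, 2, 2, 3, 0, 3]
```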

Upvotes: 5

Henry Keiter

Reputation: 17168

You're close; you just want to iterate over a range of integers instead of your actual list. Something like this:

# note: lowercase variable names are Python standard and good coding practice!
# + 1 because range() stops just before its end value
d = {n: list_of_ints.count(n) for n in range(max(list_of_ints) + 1)}

Note that I'm using max(list_of_ints), the biggest element in your list, as the upper bound, since you didn't specify one. You could hardcode this number instead, or, if you want to restrict your histogram to the range of data in the list, make it range(min(list_of_ints), max(list_of_ints) + 1). The + 1 is needed because range excludes its end value.
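
Both options (a hardcoded bound, or the data's own range) can be handled by one function. A rough sketch, assuming integer input; histogram, lo and hi are names I made up for illustration. It counts in a single pass rather than calling list.count once per value, since each count call rescans the whole list.

```python
def histogram(values, lo=None, hi=None):
    """Return {n: count} for every integer n in lo..hi inclusive.

    histogram, lo and hi are hypothetical names. When a bound is omitted,
    it falls back to the smallest / largest value in the data.
    """
    if not values and (lo is None or hi is None):
        return {}  # assumption: no data and no explicit bounds means no bins
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    counts = {n: 0 for n in range(lo, hi + 1)}  # every bin starts at 0
    for v in values:
        if v in counts:  # values outside lo..hi are ignored
            counts[v] += 1
    return counts

list_of_ints = [0, 1, 1, 2, 2, 2, 4, 4, 5, 5, 6, 6, 6, 8, 8, 8]
print(histogram(list_of_ints))         # bins 0..8
print(histogram(list_of_ints, hi=10))  # bins 0..10, with 9 and 10 at 0
```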

Upvotes: 1
