frazman

Reputation: 33223

What is the cleanest way to count frequencies in Python?

I have data like the following: [0.1, 0.2, 1, 5, 100] and so on. What I want to do is count the number of items in each of these ranges:

1-10
11-20
21-30
... and so on...

Right now, my code is very messy.

What I have done is map each range to a bucket ID:

1-10 :=> 0
11-20:=> 1
... and so on.

So I have defined buckets where bucket 0 has range 1-10, bucket 1 has range 11-20 and so on.

And the code is:

for ele in data:
    bucket_id = get_bucket_id(ele)
    freq_dict[bucket_id] +=1

get_bucket_id is one big if/else chain.

Is there a better way to do this?

Upvotes: 2

Views: 2371

Answers (4)

mdml

Reputation: 22882

You could use numpy.histogram, which counts how many elements of your data fall into each of a set of intervals (bins). It returns the counts per bin and the array of bin edges, which has one more entry than there are bins. By default it uses 10 equal-width bins spanning the range of the data:

>>> import numpy as np
>>> data = [0.1,0.2,1,5,100]
>>> hist, bin_edges = np.histogram( data )
>>> hist
array([4, 0, 0, 0, 0, 0, 0, 0, 0, 1])
>>> bin_edges
array([   0.1 ,   10.09,   20.08,   30.07,   40.06,   50.05,   60.04,
         70.03,   80.02,   90.01,  100.  ])

Upvotes: 5

Ashwini Chaudhary

Reputation: 250891

You can use collections.Counter and the bisect module here:

>>> from bisect import bisect_left
>>> from collections import Counter
>>> lis = range(0, 101, 10)
>>> l = [0.1, 0.2, 1, 5, 100, 11]
>>> c = Counter(bisect_left(lis, item) for item in l)
>>> c
Counter({1: 4, 10: 1, 2: 1})
>>> [c[i] for i in range(1, 11)]
[4, 1, 0, 0, 0, 0, 0, 0, 0, 1]
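
The same idea can be wrapped in a small helper that makes the bucket edges explicit. This is only a minimal sketch: the function name count_buckets and its width and upper parameters are hypothetical, not part of the answer above, and it assumes bucket i covers (i*width, (i+1)*width], so 10 lands in the first bucket and 11 in the second.

```python
from bisect import bisect_left
from collections import Counter


def count_buckets(values, width=10, upper=100):
    """Count values per bucket; bucket i covers (i*width, (i+1)*width].

    Anything at or below `width` (including zero and negatives) is
    counted in bucket 0; anything above `upper` is left out.
    """
    # Right edges of the buckets: width, 2*width, ..., upper
    edges = list(range(width, upper + 1, width))
    counts = Counter(bisect_left(edges, v) for v in values)
    # Values above `upper` map to index len(edges), which is not reported.
    return [counts[i] for i in range(len(edges))]


result = count_buckets([0.1, 0.2, 1, 5, 100, 11])
print(result)
```

Because the right edges are searched with bisect_left, a value equal to an edge stays in the lower bucket, which matches the 1-10, 11-20 ranges in the question.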

Upvotes: 1

jazzpi

Reputation: 1429

You could use len and filter:

c = []
for l, u in [(1, 10), (11, 20), (21, 30)]: # ...
    # list() is needed on Python 3, where filter returns an iterator
    c.append(len(list(filter(lambda x: l <= x <= u, values))))

Upvotes: 1

Fred Foo

Reputation: 363517

Use a Counter and compute the bucket using integer division.

from collections import Counter

freq = Counter()
for x in data:
    freq[(x - 1) // 10] += 1

Note that this maps values less than one to -1. If your data is not strictly positive, you'll actually want to use the ranges 0-9, 10-19, etc., i.e. x // 10.
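
To illustrate the note above, here is a small sketch run on the sample data from the question, followed by a variant for fractional values. The math.ceil variant is an assumption about what the asker wants (buckets of the form (0, 10], (10, 20], ...) and is not part of the original answer.

```python
import math
from collections import Counter

data = [0.1, 0.2, 1, 5, 100]

# Integer-division bucketing as in the answer:
# values below 1 (here 0.1 and 0.2) land in bucket -1.
freq = Counter((x - 1) // 10 for x in data)

# Assumed variant for fractional data: bucket i covers (10*i, 10*(i + 1)],
# so 0.1 and 10 both fall in bucket 0 and 11 falls in bucket 1.
freq_ceil = Counter(math.ceil(x / 10) - 1 for x in data)

print(sorted(freq.items()))
print(sorted(freq_ceil.items()))
```

With float input, the first version produces float keys such as -1.0; they compare and hash equal to the corresponding integers, so freq[-1] still finds them.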

Upvotes: 6
