Ivan Trushin

Reputation: 358

Is there a better way to count the frequencies of elements falling into intervals?

I have a NumPy array and I want to count the relative frequencies of elements falling into specific intervals. For example:

array = np.array([0, 1, 1, 1, 2, 3, 4, 5]) 
intervals = np.array([0., 0.5, 1., 1.5, 2., 2.5, 3., 3.5, 4., 4.5, 5.])
result = {0.5: 0.125, 1.5: 0.375, 2.5: 0.125, 3.5: 0.125, 4.5: 0.125}

I have code that works fine, but it looks messy to me:

import numpy as np
from collections import Counter

def freqs(arr):
    # define the interval edges in steps of 0.5
    intervals = np.arange(round(np.min(arr)), round(np.max(arr))+0.5, 0.5)
    frequency = list()

    # for each number in the array, append the first interval edge that is larger than it
    for arr_i in arr:
        for intr_j in intervals:
            if arr_i < intr_j:
                frequency.append(intr_j)
                break

    # count how often each interval edge was picked
    dic = dict(Counter(frequency))
    # divide the counts by the length of the array
    freqs = dict(zip(list(dic.keys()), (np.array(list(dic.values())))/len(arr)))

    return freqs
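
The double loop above looks up, for each element, the first interval edge that is strictly greater than it. A minimal sketch of the same lookup without Python loops, assuming the edges are sorted (they are, coming from np.arange), could use np.searchsorted. The names freqs_vectorized and step are illustrative and not part of the original code.

```python
import numpy as np

def freqs_vectorized(arr, step=0.5):
    arr = np.asarray(arr)
    # Same edges as the loop version: from round(min) to round(max), inclusive
    intervals = np.arange(round(np.min(arr)), round(np.max(arr)) + step, step)
    # side="right" returns the index of the first edge strictly greater than each element
    idx = np.searchsorted(intervals, arr, side="right")
    # Elements with no larger edge (idx == len(intervals)) are skipped, as in the loop
    idx = idx[idx < len(intervals)]
    edges, counts = np.unique(intervals[idx], return_counts=True)
    # Divide by the full array length, like the original function
    return {float(e): float(c) / len(arr) for e, c in zip(edges, counts)}

print(freqs_vectorized(np.array([0, 1, 1, 1, 2, 3, 4, 5])))
```

As in the loop version, the element 5 has no edge above it and is not counted, so the frequencies sum to 0.875 for this input.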

The part I don't like is where we divide the dictionary's values by the length of the array: it takes a lot of constructs to build the new dictionary, when all we are doing is dividing the values by a constant.
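
For the division step on its own, a dict comprehension over the Counter is probably the lightest option. This is only a sketch: normalize_counts and its parameters are hypothetical names, and total is passed in explicitly because the function above divides by len(arr) rather than by the number of collected labels.

```python
from collections import Counter

def normalize_counts(labels, total):
    """Map each distinct label to its count divided by total."""
    counts = Counter(labels)
    # One pass over the counter, no intermediate lists or arrays
    return {label: count / total for label, count in counts.items()}

# Labels as the loop would collect them for the example array (8 elements, 5 is dropped)
print(normalize_counts([0.5, 1.5, 1.5, 1.5, 2.5, 3.5, 4.5], 8))
```

Inside freqs this would replace the last two statements with a single line of the form {k: v / len(arr) for k, v in Counter(frequency).items()}.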

Upvotes: 2

Views: 379

Answers (3)

Mykola Zotko

Reputation: 17824

You can use:

arr = np.logical_and(intervals[:-1:2] <= array[:,None],
                     array[:,None] < intervals[1::2])
dict(zip(intervals[1::2], arr.sum(axis=0) / len(array)))

Output:

{0.5: 0.125, 1.5: 0.375, 2.5: 0.125, 3.5: 0.125, 4.5: 0.125}

Upvotes: 1

abhilb

Reputation: 5757

Improving upon the answer from @YOLO

>>> c, b = np.histogram(array, bins=intervals)
>>> {i:j for i,j in zip(b[1::2], c[0::2]/len(array))}
{0.5: 0.125, 1.5: 0.375, 2.5: 0.125, 3.5: 0.125, 4.5: 0.125}

Upvotes: 1

YOLO

Reputation: 21719

I could get almost the same result as yours using the np.histogram function.

result, _ = np.histogram(array, bins=intervals)
result = result / len(array)
filter_result = result[np.where(result > 0)]
print(filter_result)

[0.125 0.375 0.125 0.125 0.125 0.125]
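
The output has six values rather than the five in the question because np.histogram treats every bin as half-open except the last one, which also includes its right edge, so the element 5 lands in the bin from 4.5 to 5. A small sketch that keeps the bin edges next to the frequencies makes this visible; keying each bin by its right edge is a choice made here to match the question's result format.

```python
import numpy as np

array = np.array([0, 1, 1, 1, 2, 3, 4, 5])
intervals = np.array([0., 0.5, 1., 1.5, 2., 2.5, 3., 3.5, 4., 4.5, 5.])

counts, edges = np.histogram(array, bins=intervals)
# Keep only non-empty bins, keyed by the right edge of each bin
freqs = {
    float(right): float(count) / len(array)
    for right, count in zip(edges[1:], counts)
    if count > 0
}
print(freqs)
# {0.5: 0.125, 1.5: 0.375, 2.5: 0.125, 3.5: 0.125, 4.5: 0.125, 5.0: 0.125}
```

Dropping the 5.0 entry reproduces the dictionary from the question.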

Hope this gives you some idea.

Upvotes: 2
