Johnnyduke

Reputation: 29

How to count the values per bin of an already generated histogram?

I want to count the values per bin and have it populate a data frame.

a = smry_dmo.loc['Mean', 'Income']
b = smry_dmo.loc['Standard Deviation', 'Income']

plt.hist(dmo_df.Income, 10, color = 'magenta', edgecolor = 'black')
plt.title(rf'Distribution of Income: $\mu={a}$, $\sigma={b}$')
plt.xlabel('Income')
plt.ylabel('Frequency')
plt.show()

Let me know if what I'm asking isn't clear.

Thank you.

Upvotes: 0

Views: 1417

Answers (1)

Rabinzel

Reputation: 7903

plt.hist returns a tuple (n, bins, patches): the counts per bin, the bin edges, and the drawn bar patches. You just need to capture them when you call it so you can use them afterwards.

n, bins, patches = plt.hist(dmo_df.Income, 10, color = 'magenta', edgecolor = 'black')

I made a small example to show what it looks like.

import numpy as np
import matplotlib.pyplot as plt

# x = np.random.randint(1, 20, size=20)
x = np.array([10, 18,  6, 13,  2, 18,  5, 13, 13,  5, 11, 18,  1,  7,  8, 10, 12, 9, 17, 2])

n, bins, patches = plt.hist(x, bins=5, color = 'magenta', edgecolor = 'black')
plt.show()

print(n)
# [3. 4. 5. 4. 4.]
print(bins)
# [ 1.   4.4  7.8 11.2 14.6 18. ]
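
If you only need the numbers and not the plot, np.histogram should give the same counts and edges without drawing anything. This is a minimal sketch on the same example array; the names counts and edges are just my choice here.

```python
import numpy as np

x = np.array([10, 18, 6, 13, 2, 18, 5, 13, 13, 5, 11, 18, 1, 7, 8, 10, 12, 9, 17, 2])

# Same binning as plt.hist(x, bins=5), but nothing is drawn
counts, edges = np.histogram(x, bins=5)

print(counts)  # one count per bin
print(edges)   # one more edge than there are bins
```

Note that np.histogram returns integer counts, whereas plt.hist returns them as floats.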

Following this answer, you can use numpy to get, for each bin, an array of the data values that fall into it:

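# Pair up consecutive edges: one (left, right) row per bin
binlist = np.c_[bins[:-1], bins[1:]]
d = np.array(x)
for i in range(len(binlist)):
    if i == len(binlist) - 1:
        # the last bin also includes its right edge, as plt.hist does
        values = d[(d >= binlist[i, 0]) & (d <= binlist[i, 1])]
    else:
        values = d[(d >= binlist[i, 0]) & (d < binlist[i, 1])]
    print(values)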

Output:

[2 1 2]
[6 5 5 7]
[10 11  8 10  9]
[13 13 13 12]
[18 18 18 17]
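
A loop-free variant of the same grouping could use np.digitize. This is only a sketch: bin_idx and groups are names I made up, and the edges come from np.histogram here so the snippet runs on its own (the bins returned by plt.hist should work the same way).

```python
import numpy as np

x = np.array([10, 18, 6, 13, 2, 18, 5, 13, 13, 5, 11, 18, 1, 7, 8, 10, 12, 9, 17, 2])
counts, edges = np.histogram(x, bins=5)

# Pass only the inner edges, so a value equal to the last edge
# still lands in the last bin (index 4), like in the histogram
bin_idx = np.digitize(x, edges[1:-1])

# One array of original values per bin, in their original order
groups = [x[bin_idx == i] for i in range(len(edges) - 1)]
for i, g in enumerate(groups):
    print(i, g)
```

The bin_idx array also tells you which bin each individual value belongs to, which can be handy as an extra column next to the data.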

I'm not sure this is the best solution, but if you want a DataFrame with just the bin ranges and counts, you could build it like this:

df1 = pd.DataFrame({
    'bin_index' : list(range(len(n))),
    'counts': n,
    'left_bin_limit': bins[:-1],
    'right_bin_limit': bins[1:],
})

print(df1)

   bin_index  counts  left_bin_limit  right_bin_limit
0          0     3.0             1.0              4.4
1          1     4.0             4.4              7.8
2          2     5.0             7.8             11.2
3          3     4.0            11.2             14.6
4          4     4.0            14.6             18.0
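
If you need this for more than one column, the same idea can be wrapped in a small function. hist_to_frame is a hypothetical helper name, not something pandas or matplotlib provides; it also casts the counts to int, since plt.hist returns them as floats. The example call uses np.histogram so it runs without a plot, but the n and bins from plt.hist can be passed in just the same.

```python
import numpy as np
import pandas as pd

def hist_to_frame(n, bins):
    """Hypothetical helper: turn histogram counts and edges into a DataFrame."""
    bins = np.asarray(bins)
    df = pd.DataFrame({
        'counts': np.asarray(n).astype(int),  # plt.hist gives floats
        'left_bin_limit': bins[:-1],
        'right_bin_limit': bins[1:],
    })
    return df.rename_axis('bin_index')

x = np.array([10, 18, 6, 13, 2, 18, 5, 13, 13, 5, 11, 18, 1, 7, 8, 10, 12, 9, 17, 2])
n, bins = np.histogram(x, bins=5)
df1 = hist_to_frame(n, bins)
print(df1)
```

For the question's data the call would presumably look like hist_to_frame(n, bins) right after n, bins, patches = plt.hist(dmo_df.Income, 10, ...).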

Upvotes: 1
