Reputation: 75
I need to find thresholds of bins (for ex. 0-999, 1000-1999 etc.), so that on each bin there was approximately an equal amount (1/n of total value, for ex 1/3 if we split into 3 bins).
d = {'amount': [600,400,250,340,200,500,710]}
df = pd.DataFrame(data=d)
df
amount
600
400
250
340
200
500
710
expected output if we split into 3 bins by sum of amount column:
bin sum
threshold_1(x value-x value) 1000
threshold_2(x-x) 1000
threshold_3(x-x) 1000
something like this, but i need sum value instead of count
pd.cut(amount, 3).value_counts()
maybe it could be solved in python, not only via pandas?
Upvotes: 0
Views: 920
Reputation: 862581
If need approximately an equal amount aggregate sum
with pd.cut
:
df = df.groupby(pd.cut(df.amount, 3)).sum()
print (df)
amount
amount
(199.49, 370.0] 790
(370.0, 540.0] 900
(540.0, 710.0] 1310
Upvotes: 1