Kurasao
Kurasao

Reputation: 75

Find thresholds of bins by sum of column values in pandas

I need to find thresholds of bins (for ex. 0-999, 1000-1999 etc.), so that on each bin there was approximately an equal amount (1/n of total value, for ex 1/3 if we split into 3 bins).

d = {'amount': [600,400,250,340,200,500,710]}
df = pd.DataFrame(data=d)
df

amount
600
400
250
340
200
500
710

expected output if we split into 3 bins by sum of amount column:

bin                          sum
threshold_1(x value-x value) 1000
threshold_2(x-x)             1000
threshold_3(x-x)             1000

something like this, but i need sum value instead of count

pd.cut(amount, 3).value_counts()

maybe it could be solved in python, not only via pandas?

Upvotes: 0

Views: 920

Answers (1)

jezrael
jezrael

Reputation: 862581

If need approximately an equal amount aggregate sum with pd.cut:

df = df.groupby(pd.cut(df.amount, 3)).sum()
print (df)
                 amount
amount                 
(199.49, 370.0]     790
(370.0, 540.0]      900
(540.0, 710.0]     1310

Upvotes: 1

Related Questions