Reputation: 579
I would like to groupby my pandas dataframe based on a given range condition. For example:
Number, Amount
1, 5
2, 10
3, 11
4, 3
5, 5
6, 8
7, 9
8, 6
Range conditions:
1 till 4 (included), named A: 5+10+11+3 = 29
5 and higher, named B: 5+8+9+6 = 28
Desired outcome:
Number, Amount
A, 29
B, 28
EDIT: Thanks for the great solutions. I would like to be able to add more range conditions if needed, so 5-7 and 8 alone should also be a possibility. Is it possible to keep that flexible?
Upvotes: 4
Views: 875
Reputation: 5012
Here you go:
import numpy as np
nos = df[(df.Number.between(1, 4)) | (df.Number >= 5)]
nos.groupby(np.where(nos.Number <= 4, 'A', 'B'))\
.Amount.sum().reset_index()\
.rename(columns={'index': 'Number'})
Output
Number Amount
0 A 29
1 B 28
Upvotes: 0
Reputation: 153550
You can use pd.cut
also, helpful if you more that two labels and ranges:
df.groupby(pd.cut(df['Number'],
bins=[0,4,np.inf],
labels=['A', 'B']))['Amount'].sum().reset_index()
Output:
Number Amount
0 A 29
1 B 28
Upvotes: 4
Reputation: 25269
Using groupby with np.where
df.groupby(np.where(df.Number <= 4, 'A', 'B')).Amount.sum().reset_index()
Out[16]:
index Amount
0 A 29
1 B 28
If you need matching column name, just chain rename_axis
df_final = (df.groupby(np.where(df.Number <=4, 'A', 'B')).Amount.sum()
.rename_axis('Number')
.reset_index())
Out[25]:
Number Amount
0 A 29
1 B 28
Upvotes: 2
Reputation: 150815
You can check values of Number
, map it to A,B
then groupby:
(df.Amount.groupby(df.Number.ge(5).map({True:'B',False:'A'}))
.sum()
.reset_index()
)
Output:
Number Amount
0 A 29
1 B 28
Upvotes: 4