Dylan Smith
Dylan Smith

Reputation: 118

Groupby bins on multiple items

My dataframe looks something like this (but with about 100,000 rows of data):

ID,Total,TotalDate,DaysBtwRead,Type,YearlyAvg 
1,1250,6/2/2017,17,AT267,229 
2,1670,2/3/2012,320,PQ43,50

I'm trying to groupby yearly average totals using

df.groupby(pd.cut(df['YearlyAvg'], np.arange(0,1250,50))).count()

so that I can set-up for a unique monte carlo distribution, but I need these grouped by each individual Type as well. This currently only counts each range regardless of any other values.

Rather than having an overall aggregate count, I'm trying to set my code up so that the output looks more like the following (with YearlyAvg containing a count of each range)

Index,YearlyAvg
AT267(0, 50], 200  
PQ43(0, 50], 123    
AT267(50, 100], 49   
PQ43(50, 100], 67

Is there an easier way to do this outside of creating a separate dataframe for each Type value?

Upvotes: 1

Views: 728

Answers (1)

BENY
BENY

Reputation: 323226

You can using unstack with stack

df['bins']=pd.cut(df['YearlyAvg'], np.arange(0,1250,50))
df.groupby(['Type','bins']).size().unstack(fill_value=0).stack()# also here will create the multiple index for achieve what you need 
Out[1783]: 
Type   bins      
AT267  (0, 50]       0
       (200, 250]    1
PQ43   (0, 50]       1
       (200, 250]    0
dtype: int64

Upvotes: 2

Related Questions