Reputation: 2871

Groupby sum and divide in pandas

I have a df as shown below:

Place       Occupancy     Number
Bangalore   Occupied      80    
Bangalore   Vacant       20
Chennai     Occupied      90
Chennai     Vacant       60
Delhi       Occupied      20
Delhi       Vacant       20

I'm trying to generate the following:

Place         Occupancy_%     Total_Number   Number_vacant  Number_occupied
Bangalore     80              100            20             80
Chennai       60              150            60             90
Bangalore     50              40             20             20

Upvotes: 2

Answers (3)

Yaakov Bressler

Reputation: 12168

I would do this in 2 steps. First, generate a pivot table:

df_pivot = df.pivot(index='Place',columns='Occupancy',values='Number')

Occupancy  Occupied  Vacant
Place                      
Bangalore        80      20
Chennai          90      60
Delhi            20      20

Second, compute to achieve your desired outcome:

df_pivot['Total_Number'] = df_pivot[['Occupied','Vacant']].sum(axis=1)
df_pivot['Occupied_Pct'] = df_pivot['Occupied'] / df_pivot['Total_Number'] 
df_pivot['Vacant_Pct'] = df_pivot['Vacant'] / df_pivot['Total_Number'] 

Occupancy  Occupied  Vacant  Total_Number  Occupied_Pct  Vacant_Pct
Place                                                              
Bangalore        80      20           100           0.8         0.2
Chennai          90      60           150           0.6         0.4
Delhi            20      20            40           0.5         0.5

Upvotes: 1

BENY

Reputation: 323396

Let us do it with one-line by crosstab and assign

df=pd.crosstab(index=df.Place,
               columns=df.Occupancy,
               values=df.Number,
               aggfunc='sum',
               margins = True,
               margins_name ='Total_number' ).drop('Total_number').\
       assign(Occupancy=lambda  x : x['Occupied']*100/x['Total_number'] )
Out[128]: 
Occupancy  Occupied  Vacant  Total_number  Occupancy
Place                                               
Bangalore        80      20           100       80.0
Chennai          90      60           150       60.0
Delhi            20      20            40       50.0

Upvotes: 2

Quang Hoang

Reputation: 150825

You can do something like this:

new_df = df.pivot(index='Place', columns='Occupancy', values='Number')
new_df['Total_Number'] = new_df.sum(1)
new_df['Occupancy_%'] = new_df['Occupied']/new_df['Total_Number'] * 100

Output:

Occupancy  Occupied  Vacant  Total_Number  Occupancy_%
Place                                                 
Bangalore        80      20           100         80.0
Chennai          90      60           150         60.0
Delhi            20      20            40         50.0

Upvotes: 2

Groupby sum and divide in pandas

Answers (3)

Related Questions