Reputation: 27
I have created a new aggregated dataframe by using groupby and I am having problems with adding a subtotal row under each category.
I have tried using pd.groupby and pivottable and changing the index but I don't manage to represent the data as I want it.
df_balance['Subtotal'] = df_balance.groupby('Client')['USD_Balance'].transform('sum')
+----------+-------------+------------+
|CLient ID | USD_Balance | Subtotal |
+----------+---------+------------+----
| 1 | 2 | 6 |
| 1 | 2 | 6 |
| 1 | 2 | 6 |
+----------+-------------+------------+
|---------------------|------------------|
| Client ID | USD_Balance |
|---------------------|------------------|
| 1 | 2 |
|---------------------|------------------|
| 1 | 2 |
|---------------------|------------------|
| 1 | 2 |
|---------------------|------------------|
| SubTotal | 6 |
|---------------------|------------------|
I would like to add a Subtotal row, with the corresponding agg per Client ID group.
Thanks in advance for any pointers on how to present my data like this!
Upvotes: 2
Views: 6811
Reputation: 153460
Are you okay with doing it a little different way?
dftotal = df.groupby('CLient ID')['USD_Balance'].sum().reset_index()
dftotal['CLient ID'] = 'SubTotal'
pd.concat([df, dftotal])
Output:
CLient ID USD_Balance
0 1 2
1 1 2
2 1 2
0 SubTotal 6
Upvotes: 0
Reputation: 3739
sum_res= df.groupby(['CLient ID'],as_index=False)['USD_Balance'].sum()
sum_res['grand_total'] ='Grand Total'
df.sort_values(by=['CLient ID'],ascending=[True],inplace=True)
Separate two columns from original dataframe after sorting
res = df[['CLient ID','USD_Balance']]
final_res = pd.concat([res,sum_res])
final_res = final_res.sort_values(by=['CLient ID','grand_total'],ascending=[True,True],na_position='first')
final_res['CLient ID'] =np.where(final_res['grand_total'].isnull(),
final_res['CLient ID'],
final_res['grand_total'])
final_res.drop(['grand_total'],axis=1,inplace=True)
Upvotes: 1
Reputation: 42916
You can use groupby
and access each group and add an Subtotal row:
dfs = []
for _, d in df.groupby('CLient ID', as_index=False):
d.loc['Total', 'USD_Balance'] = df['USD_Balance'].sum()
dfs.append(d)
df_final = pd.concat(dfs, ignore_index=True)
CLient ID USD_Balance
0 1.0 2.0
1 1.0 2.0
2 1.0 2.0
3 NaN 6.0
Upvotes: 5