sgerbhctim
sgerbhctim

Reputation: 3640

Apply groupby on a DataFrame to display cumulative stats

Let's say I have a DataFrame that looks like this:

Bank Name     House     This Wk
Barc          Germany   100
Barc          UK        300
Barc          UK        500
JPM           Japan     200
JPM           NYC       100
BOA           LA        900
BOA           LA        50
BOA           LA        50
DB            Italy     45

I would like to group-by Bank Name, while outputting the largest House Value as well as the total value...

For example, using the example above would result in:

Bank Name     Total     House     This Wk
Barc          900       UK        500
JPM           300       Japan     200
BOA           1000      LA        900
DB            45        Italy     45

Essentially, it is grouping the Total by Bank Name, but also outputting the largest contributor, House, to the total and the amount contributed is This Wk.

How can I go about doing this?

Upvotes: 0

Views: 81

Answers (3)

cs95
cs95

Reputation: 402263

You can consider df.groupby with a list of dfGroupBy.agg functions:

In [732]: out = df.groupby('Bank Name')['This Wk'].agg(['sum', 'idxmax', 'max'])\
               .rename(columns={'sum' : 'Total', 'idxmax' : 'House', 'max' : 'This Wk'})\
               .reset_index()


In [734]: out['House'] = df.loc[out['House'], 'House'].values; out
Out[734]: 
  Bank Name  Total  House  This Wk
0       BOA   1000     LA      900
1      Barc    900     UK      500
2        DB     45  Italy       45
3       JPM    300  Japan      200

Upvotes: 3

Zero
Zero

Reputation: 76917

Another way using apply would be

In [17]: (df.groupby('Bank Name', sort=False)
            .apply(lambda x: pd.Series(
                             [x['This Wk'].sum(), 
                              x.loc[x['This Wk'].idxmax(), 'House'], 
                              x['This Wk'].max()], 
                   index=['Total', 'House', 'This Wk']))
            .reset_index())
Out[17]:
  Bank Name  Total  House  This Wk
0      Barc    900     UK      500
1       JPM    300  Japan      200
2       BOA   1000     LA      900
3        DB     45  Italy       45

Upvotes: 0

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210822

In [121]: df.groupby('Bank Name', group_keys=False) \
     ...:   .apply(lambda x: x.nlargest(1, 'This Wk').assign(Total=x['This Wk'].sum())) \
     ...:   [['Bank Name','Total','House','This Wk']]
     ...:
Out[121]:
  Bank Name  Total  House  This Wk
5       BOA   1000     LA      900
2      Barc    900     UK      500
8        DB     45  Italy       45
3       JPM    300  Japan      200

Upvotes: 5

Related Questions