user11427018
user11427018

Reputation: 139

Pandas: Groupby multiple columns, finding the max value and keep other columns in dataframe

I am looking to group multiple columns in a dataframe, keep only the max Value, and keeping the corresponding date column

Below is how the dataframe looks like:

Index Site Device Type Value Time
0 AAA A 10 2021-02-02 01:30:00
1 AAA A 5 2021-02-02 01:35:00
2 AAA B 2 2021-02-02 01:40:00
3 BBB C 3 2021-02-02 02:00:00
4 BBB C 11 2021-02-02 02:05:00
5 BBB C 20 2021-02-02 02:10:00
6 BBB D 30 2021-02-02 04:00:00

I am trying to get the following output:

Index Site Device Type Value Time
0 AAA A 10 2021-02-02 01:30:00
1 AAA B 2 2021-02-02 01:40:00
2 BBB C 20 2021-02-02 02:10:00
3 BBB D 30 2021-02-02 04:00:00

When I try the following groupby, the Time column drops: df_max = df.groupby(['Site','Device Type'],as_index=False)['Value'].max()

I am looking to Keep the Time value corresponding to the maximum value found.

Thank you

Upvotes: 7

Views: 8896

Answers (1)

Pygirl
Pygirl

Reputation: 13349

You are very close. try using idxmax and show rows at that location:

df.loc[df.groupby(['Site','Device Type'])['Value'].idxmax()].reset_index(drop=True)

    Index   Site    Device Type Value   Time
0   0       AAA     A           10      2021-02-02 01:30:00
1   2       AAA     B           2       2021-02-02 01:40:00
2   5       BBB     C           20      2021-02-02 02:10:00
3   6       BBB     D           30      2021-02-02 04:00:00

Upvotes: 11

Related Questions