Jordan
Jordan

Reputation: 1495

How to group by key and only return observation by max from pandas dataframe

How does one return all observations from a dataframe that only hold the max value for each unique key associated? I tried groupby and max but for every difference I get more than on value per key returned back. See example below:

import pandas as pd
key = [111,111,222,333,444,555]
flag = [0,1,1,1,1,1]
date = [pd.to_datetime('2020-01-01'),pd.to_datetime('2020-01-02'), 
        pd.to_datetime('2020-02-01'),pd.to_datetime('2020-04-01'),
        pd.to_datetime('2020-03-01'),pd.to_datetime('2020-05-01')]

df_dic = {'key':key, 'flag':flag, 'date':date}
df = pd.DataFrame(df_dic)
df
df.groupby(['key', 'flag']).agg({'date':'max'}).reset_index()

This returns:

key  flag       date
0  111     0 2020-01-01
1  111     1 2020-01-02
2  222     1 2020-02-01
3  333     1 2020-04-01

I want to return only the observations with the max date per unique key like so:

key  flag       date
0  111     1 2020-01-02
1  222     1 2020-02-01
2  333     1 2020-04-01

Upvotes: 1

Views: 232

Answers (2)

Ben.T
Ben.T

Reputation: 29635

you can avoid groupby and use sort_values and drop_dplicates.

print(df.sort_values('date').drop_duplicates('key', keep='last'))
   key  flag       date
1  111     1 2020-01-02
2  222     1 2020-02-01
4  444     1 2020-03-01
3  333     1 2020-04-01
5  555     1 2020-05-01

Note that you can add the key column in the sort_values if the order matters to you in the result df.sort_values(['key', 'date'])...

Upvotes: 1

Ric S
Ric S

Reputation: 9257

One possible solution (although maybe not the best one) is to use an inner join after the groupby method. This ensures that the flags are taken from the original dataframe, either they are 0 or 1.

df.groupby(['key'])\
    .agg({'date': 'max'})\
    .reset_index()\
    .merge(df, how='inner', on=['key', 'date'])

#    key       date  flag
# 0  111 2020-01-02     1
# 1  222 2020-02-01     1
# 2  333 2020-04-01     1
# 3  444 2020-03-01     1
# 4  555 2020-05-01     1

Upvotes: 1

Related Questions