RML
RML

Reputation: 53

Pandas: how to get rows with max timestamp for groupby

Given a DataFrame I want to make a new DataFrame for rows with the max timestamp, for a combo of columns

Combo: category, revision, type, subtype

sub_type may/not have a value (but the None is part of its unique-ness)

I won't have duplicates based on the above (no ties on timestamp)

    action  category   YYYYMM      timestamp      sub_type       type 
0     buy      A       202002           4            None        apple 
1     sell     A       202002           5            None        apple 
2     buy      A       202002           4            green       apple 
3     buy      A       202002           4            red         apple 
4     sell     A       202002           3            red         apple 
5     sell     A       202002           1            None        orange
6     sell     B       202002           6            None        apple 

The result for the above DataFrame would look like this:

    action  category  revision      timestamp      sub_type      type 
0     sell     A       202002           5            None        apple 
1     buy      A       202002           4            green       apple 
2     buy      A       202002           4            red         apple 
3     sell     A       202002           1            None        orange
4     sell     B       202002           6            None        apple 

Basically - I want the last action for a combo of attributes

Upvotes: 0

Views: 1699

Answers (1)

BENY
BENY

Reputation: 323226

So we need use fillna here ,since None == None will return True. After that we can do sort_values then drop_duplicates

out = df.sort_values('timestamp').fillna('None').\ 
            drop_duplicates(['category','sub_type','YYYYMM','type'],keep='last').\
               sort_index()

out
Out[128]: 
  action category  YYYYMM  timestamp sub_type    type
1   sell        A  202002          5     None   apple
2    buy        A  202002          4    green   apple
3    buy        A  202002          4      red   apple
5   sell        A  202002          1     None  orange
6   sell        B  202002          6     None   apple

Upvotes: 2

Related Questions