Johannes
Johannes

Reputation: 157

Multi-index - get maximum of second index for each first index

I have dataframe with a multi-index which looks like the following:

ts                   ts2
2018-05-24 23:00:00  2018-05-24 00:00:00  5
                     2018-05-24 00:15:00  4 
                     2018-05-24 00:30:00  6 
                     2018-05-24 00:45:00  7

I would like to get for each datetime in index 1 the maximum value of index 2:

2018-05-24 23:00:00  2018-05-24 00:45:00  7

This cant be so difficult but somehow I cant manage it.

I tried a for loop slicing

for ts, new_df in df.groupby(level=0):
   print(new_df)

which leads to a new dataframe per ts .... Furthermore, I am able to obtain the maximum index of the new_df

new_df.index.max()

With these info, I should be able to select the correct value (index pair) and build the desired dataframe. However, I cant manage it.....

any help appreciated.

Thank you

Upvotes: 1

Views: 835

Answers (1)

jezrael
jezrael

Reputation: 863791

You can use idxmax with reset_index and select by positions with iloc:

df = df.iloc[df.reset_index().groupby('ts')['ts2'].idxmax()]
print (df)
                                         val
ts                  ts2                     
2018-05-24 23:00:00 2018-05-24 00:45:00    7

Or use boolean indexing with Index.duplicated, but first sort_index:

df = df.sort_index()
df = df[~df.index.get_level_values(0).duplicated(keep='last')]
print (df)
                                         val
ts                  ts2                     
2018-05-24 23:00:00 2018-05-24 00:45:00    7

Upvotes: 1

Related Questions