Reputation: 157
I have dataframe with a multi-index which looks like the following:
ts ts2
2018-05-24 23:00:00 2018-05-24 00:00:00 5
2018-05-24 00:15:00 4
2018-05-24 00:30:00 6
2018-05-24 00:45:00 7
I would like to get for each datetime in index 1 the maximum value of index 2:
2018-05-24 23:00:00 2018-05-24 00:45:00 7
This cant be so difficult but somehow I cant manage it.
I tried a for loop slicing
for ts, new_df in df.groupby(level=0):
print(new_df)
which leads to a new dataframe per ts .... Furthermore, I am able to obtain the maximum index of the new_df
new_df.index.max()
With these info, I should be able to select the correct value (index pair) and build the desired dataframe. However, I cant manage it.....
any help appreciated.
Thank you
Upvotes: 1
Views: 835
Reputation: 863791
You can use idxmax
with reset_index
and select by positions with iloc
:
df = df.iloc[df.reset_index().groupby('ts')['ts2'].idxmax()]
print (df)
val
ts ts2
2018-05-24 23:00:00 2018-05-24 00:45:00 7
Or use boolean indexing
with Index.duplicated
, but first sort_index
:
df = df.sort_index()
df = df[~df.index.get_level_values(0).duplicated(keep='last')]
print (df)
val
ts ts2
2018-05-24 23:00:00 2018-05-24 00:45:00 7
Upvotes: 1