Reputation: 1940
Supposing that I have this MultiIndex dataframe called df
:
| |Value
Year |Month|
1992 | 1 | 3
| 2 | 5
| 3 | 8
-----------------
1993 | 1 | 2
| 2 | 7
----------------
1994 | 1 | 20
| 2 | 50
| 3 | 10
| 4 | 5
How do I select all years and max month for each of those years?
I'd like the following result:
| |Value
Year |Month|
1992 | 3 | 8
-----------------
1993 | 2 | 7
----------------
1994 | 4 | 5
I've tried to use
df.loc[(slice(None), [3, 2, 4]),:]
This works, but it's hard-coded. How do I set it to bring always the maximum month level instead of saying it manually?
My index are sorted, so it would be take the last month for each year.
I've also tried to use the .iloc
but it doesn't work with multiindex
>>> df.iloc[(slice(None), -1),:]
...
IndexingError: Too many indexers
...
Upvotes: 1
Views: 278
Reputation: 75080
you can group on the first level and take the last of the second level and then df.loc[]
:
df.loc[pd.DataFrame.from_records(df.index).groupby(0)[1].last().items()]
Value
Year Month
1992 3 8
1993 2 7
1994 4 5
Upvotes: 2