Reputation: 69
I'm having a dataframe with multiindex, the first level is an company_ID and the second level is a timestamp. How can I get a rank of all companies depending on their scores, every month?
Score
company_idx timestamp
10006 2010-01-31 69.875394
2010-11-30 73.640693
2010-12-31 73.286248
2011-01-31 73.660052
2011-02-28 74.615564
2011-03-31 73.535187
2011-04-30 72.491390
2012-01-31 72.162768
2012-02-29 61.637952
2012-03-31 59.445419
2012-04-30 25.685615
2012-05-31 8.047693
2012-06-30 58.341200
...
9981 2016-12-31 51.011261
2018-05-31 54.462832
2018-06-30 57.126250
2018-07-31 54.695835
2018-08-31 63.758145
2018-09-30 63.255583
2018-10-31 62.069697
2018-11-30 62.795650
2018-12-31 63.045329
2019-01-31 60.276990
2019-02-28 56.666379
2019-03-31 57.903213
2019-04-30 57.558973
2019-05-31 52.260287
I've tried to do:
df2 = df.sort_index(by='Score', ascending=False)
But it's not getting me what i want. Would you be able to help? I'm quite new with multilevel dataframes. Many thanks!
Upvotes: 0
Views: 36
Reputation: 149075
You should swap the index levels to have the month first, then sort by timestamp ascending and Score descending:
df.index = df.index.swaplevel()
df.sort_values(['timestamp', 'Score'], ascending=[True, False], inplace=True)
It does not give interesting result with your sample value, because only one company has Score value for one month.
To extract the values for one month, you can use df.xs(month_value, level=0)
that will drop one level in the multi-index, or df.xs(month_value, level=0, drop_level=False)
that will keep it.
Upvotes: 2