Olaf

Reputation: 55

Ranking of multi-indexed DF

I have a multi-indexed DF with the following structure:

>>> df = pd.DataFrame({(2014, 'value'): {('AR', 0): 1.2420, ('AR', 1): 0.1802, ('BR', 0): 1.3, ('BR', 1): 0.18}})
>>> print df

      2014
      value
AR 0  1.2420
   1  0.1802
BR 0  1.3000
   1  0.1800

My goal is to add a 'rank' column that contains the ranking of the countries (AR and BR) within each second-level index value (0 and 1), in descending order of value. The desired result would be something like:

            2014          
            value   rank 
iso   id
AR    0     1.2420  2      
      1     0.1802  1    
BR    0     1.3     1    
      1     0.18    2  

My initial approach was to reset the index:

>>> df = df.reset_index()
>>> print df

       level_0   level_1   2014
                           value
0      AR        0         1.2420
1      AR        1         0.1802
2      BR        0         1.3000
3      BR        1         0.1800

And then add the 'rank' column using a groupby and rank:

>>> df[2014, 'gr'] =  df.groupby(['level_1'])[2014, 'value'].rank(ascending=False)

This results, however, in:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 2990, in __getitem__
    if len(self.obj.columns.intersection(key)) != len(key):
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/index.py", line 3774, in intersection
    result_names = self.names if self.names == other.names else None
AttributeError: 'tuple' object has no attribute 'names'

Am I on the right track, or is there another approach I should consider?

Upvotes: 2

Views: 1862

Answers (1)

TomAugspurger

Reputation: 28946

So rank is coming from value, right? Then there is no need to reset the index; you can group directly on the second index level. I think this is what you want:

In [13]: df.groupby(level=1).rank(ascending=False)
Out[13]: 
      2014
     value
AR 0     2
   1     1
BR 0     1
   1     2

which you can set with df['rank'] = df.groupby(level=1).rank(ascending=False)
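
If the rank should sit next to 'value' under the 2014 column level, as in the desired output in the question, one possible variation is to rank the single column as a Series and assign it back with a tuple key. This is only a sketch: it assumes a reasonably recent pandas on Python 3, and the index names 'iso' and 'id' are taken from the question's desired output rather than from the original frame.

```python
import pandas as pd

# Rebuild the frame from the question (MultiIndex on both rows and columns).
df = pd.DataFrame({(2014, 'value'): {('AR', 0): 1.2420, ('AR', 1): 0.1802,
                                     ('BR', 0): 1.3, ('BR', 1): 0.18}})

# Assumption: name the index levels as in the question's desired output.
df.index.names = ['iso', 'id']

# A full tuple key selects one column as a Series.
values = df[(2014, 'value')]

# Rank within each 'id' group, largest value first.
ranks = values.groupby(level='id').rank(ascending=False)

# Store the result as a new column under the same top-level label.
df[(2014, 'rank')] = ranks.astype(int)

print(df)
```

Selecting the column with the full tuple (2014, 'value') gives a Series, which sidesteps the groupby `[2014, 'value']` lookup that raised the AttributeError in the question.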

Upvotes: 6
