Reputation: 55
I have a multi-indexed DF with the following structure:
>>> df = pd.DataFrame({(2014, 'value'): {('AR', 0): 1.2420, ('AR', 1): 0.1802, ('BR', 0): 1.3, ('BR', 1): 0.18}})
>>> print df
        2014
       value
AR 0  1.2420
   1  0.1802
BR 0  1.3000
   1  0.1800
My goal is to add a 'rank' column that ranks the countries (AR and BR) against each other within each id (0 and 1), by value in descending order. The desired result would be something like:
          2014
         value  rank
iso id
AR  0   1.2420     2
    1   0.1802     1
BR  0   1.3        1
    1   0.18       2
My initial approach was to reset the index:
>>> df = df.reset_index()
>>> print df
  level_0 level_1    2014
                    value
0      AR       0  1.2420
1      AR       1  0.1802
2      BR       0  1.3000
3      BR       1  0.1800
And then add the 'rank' column using a groupby and rank:
>>> df[2014, 'gr'] = df.groupby(['level_1'])[2014, 'value'].rank(ascending=False)
This results, however, in:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.py", line 2990, in __getitem__
    if len(self.obj.columns.intersection(key)) != len(key):
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/index.py", line 3774, in intersection
    result_names = self.names if self.names == other.names else None
AttributeError: 'tuple' object has no attribute 'names'
Am I on the right track, or is there another approach I should consider?
Upvotes: 2
Views: 1862
Reputation: 28946
So the rank is computed from value, right? I think this is what you want:
In [13]: df.groupby(level=1).rank(ascending=False)
Out[13]:
      2014
     value
AR 0     2
   1     1
BR 0     1
   1     2
which you can set with df['rank'] = df.groupby(level=1).rank(ascending=False)
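If you want the rank to sit next to value under the 2014 header, as in the desired output from the question, something along these lines might work. This is only a sketch, assuming Python 3 and a reasonably recent pandas. The index level names iso and id are taken from the question's desired output, and the column key (2014, 'rank') is just my choice for illustration.

```python
import pandas as pd

# Rebuild the frame from the question with explicit MultiIndex rows and columns.
# The level names "iso" and "id" are assumed from the desired output.
index = pd.MultiIndex.from_tuples(
    [("AR", 0), ("AR", 1), ("BR", 0), ("BR", 1)], names=["iso", "id"]
)
columns = pd.MultiIndex.from_tuples([(2014, "value")])
df = pd.DataFrame([[1.2420], [0.1802], [1.3], [0.18]], index=index, columns=columns)

# Select the single value column as a Series, group by the second index
# level (the id), and rank within each group, largest value first.
ranks = df[(2014, "value")].groupby(level="id").rank(ascending=False)

# rank() returns floats; casting to int is safe here because there are
# no ties or missing values in this data.
df[(2014, "rank")] = ranks.astype(int)

print(df)
```

Selecting the column as a Series before grouping avoids indexing the groupby object with a tuple, which is where the traceback in the question comes from, and it needs no reset_index.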
Upvotes: 6