Reputation: 429
I have some simple data in a dataframe consisting of three columns [id, country, volume] where the index is 'id'.
I can perform simple operations like:
df_vol.groupby('country').sum()
and it works as expected. When I attempt to use rank(), it does not work as expected and the result is an empty dataframe.
df_vol.groupby('country').rank()
The result is inconsistent: in some cases it works. The following also works as expected:
df_vol.rank()
I want to return something like:
vols = []
for _, df in df_vol.groupby('country'):
    vols.append(df['volume'].rank())
pd.concat(vols)
Any ideas why this happens would be much appreciated!
Upvotes: 2
Views: 2549
Reputation: 863236
You can select the column with [] after the groupby, so that rank() is called only for the volume column:
df_vol.groupby('country')['volume'].rank()
Sample:
df_vol = pd.DataFrame({'country':['en','us','us','en','en'],
'volume':[10,10,30,20,50],
'id':[1,1,1,2,2]})
print(df_vol)
country id volume
0 en 1 10
1 us 1 10
2 us 1 30
3 en 2 20
4 en 2 50
df_vol['r'] = df_vol.groupby('country')['volume'].rank()
print(df_vol)
country id volume r
0 en 1 10 1.0
1 us 1 10 1.0
2 us 1 30 2.0
3 en 2 20 2.0
4 en 2 50 3.0
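As a quick check, the sketch below compares the column-selected groupby rank with the loop from the question, using the same sample data. The variable names `looped`, `grouped` and `same` are only illustrative, and `sort_index()` is added here because `pd.concat` returns the rows in group order rather than in the original row order.

```python
import pandas as pd

df_vol = pd.DataFrame({'country': ['en', 'us', 'us', 'en', 'en'],
                       'volume': [10, 10, 30, 20, 50],
                       'id': [1, 1, 1, 2, 2]})

# Loop-based version from the question: rank 'volume' inside each country.
vols = []
for _, group in df_vol.groupby('country'):
    vols.append(group['volume'].rank())
# concat yields rows in group order, so restore the original row order.
looped = pd.concat(vols).sort_index()

# Column-selected groupby rank from the answer.
grouped = df_vol.groupby('country')['volume'].rank()

# Compare both the ranks and the row labels they are attached to.
same = (looped.tolist() == grouped.tolist()
        and list(looped.index) == list(grouped.index))
print(same)
```

Both approaches rank within each country, so for this sample they should agree row by row.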
Upvotes: 5