Reputation: 477
I have a data frame as follows:
import pandas as pd

df = pd.DataFrame({'year': [2010, 2011, 2012, 2015, 2016, 2017],
                   'sales': [10, 12, 13, 9, 11, 7],
                   'Groups': ['AA', 'BB', 'AA', 'AA', 'CC', 'CC']})
What I am trying to do is map the 'Groups' column to an integer index value, so that members of the same group are assigned the same index number. Something like this:
Index year sales Groups
1 2010 10 AA
2 2011 12 BB
1 2012 13 AA
1 2015 9 AA
3 2016 11 CC
3 2017 7 CC
I was thinking of using set_index, but I am not sure that is the right approach.
Thanks for any help.
Upvotes: 1
Views: 264
Reputation: 168
Is there a reason you aren't sorting first?
Or else you can try this:
df = df.sort_values('Groups')
df['index'] = df['Groups'].rank(method='dense')
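rank(method='dense') gives every row in the same group the same rank, with no gaps between ranks, so each group ends up with one number. The ranks follow the alphabetical order of the group names and come back as floats; append .astype(int) if you need integers.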
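As a minimal sketch of the dense-rank idea on the data from the question: the snippet below skips the sort (an assumption on my part that the original row order should be kept, since rank does not need sorted input) and casts the result to int. The column name 'index' is taken from the answer above.

```python
import pandas as pd

df = pd.DataFrame({'year': [2010, 2011, 2012, 2015, 2016, 2017],
                   'sales': [10, 12, 13, 9, 11, 7],
                   'Groups': ['AA', 'BB', 'AA', 'AA', 'CC', 'CC']})

# Dense rank: equal values share a rank and ranks have no gaps.
# rank() returns floats, so cast to int for a clean group number.
df['index'] = df['Groups'].rank(method='dense').astype(int)

print(df['index'].tolist())  # [1, 2, 1, 1, 3, 3]
```

The numbers follow the alphabetical order of the group names, not the order in which the groups first appear.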
Upvotes: 1
Reputation: 323376
Using ngroup:
df.index = df.groupby('Groups').ngroup() + 1
Or factorize:
df.index = pd.factorize(df.Groups)[0] + 1
Or cat.codes:
df.index = df.Groups.astype('category').cat.codes + 1
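To compare the three options side by side, here is a small self-contained sketch using the frame from the question. The variable names (by_ngroup, by_factorize, by_codes, unsorted) are only illustrative. On this data all three agree; they can differ when the groups do not first appear in alphabetical order, because factorize numbers groups by first appearance while ngroup (with the default sort=True) and cat.codes number them in sorted order.

```python
import pandas as pd

df = pd.DataFrame({'year': [2010, 2011, 2012, 2015, 2016, 2017],
                   'sales': [10, 12, 13, 9, 11, 7],
                   'Groups': ['AA', 'BB', 'AA', 'AA', 'CC', 'CC']})

# Each approach yields a 0-based group code; +1 makes it start at 1.
by_ngroup = (df.groupby('Groups').ngroup() + 1).tolist()
by_factorize = (pd.factorize(df['Groups'])[0] + 1).tolist()
by_codes = (df['Groups'].astype('category').cat.codes + 1).tolist()
print(by_ngroup, by_factorize, by_codes)  # each is [1, 2, 1, 1, 3, 3]

# Use one of them as the index, as asked in the question.
df.index = by_factorize
df.index.name = 'Index'
print(df)

# Where the numbering differs: groups not first seen in sorted order.
unsorted = pd.Series(['BB', 'AA', 'BB'])
first_seen = (pd.factorize(unsorted)[0] + 1).tolist()                 # [1, 2, 1]
alphabetical = (unsorted.astype('category').cat.codes + 1).tolist()  # [2, 1, 2]
print(first_seen, alphabetical)
```

If the number should reflect the order in which groups first show up, factorize is the one to pick; otherwise any of the three works.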
Upvotes: 2