Reputation: 413
I would like to set the index of a DataFrame using a column that contains duplicated values. Is there a way for pandas to automatically add a second index level, so that whenever a value in the first level repeats, the second level is incremented?
For example:
   ID  name              company  position
----------------------------------------------------
0  23  Alex Monoson      Coobit   Sales manager
1  12  Johnny Johnson    Coobit   Marketing manager
2  62  Hans Dupa         Pesik    Marketing manager
3  31  Jessica Heiler    Montino  Engineer
4  92  Dominic Alvorine  Montino  CFO
5  16  Hei Lee           Coobit   CEO
I would like to use company as the index, together with a second integer index level that restarts at 0 for each company.
My expected output:
            ID    name  position
company
------------------------------------------
Coobit   0  blah  blah  blah
Coobit   1  blah  blah  blah
Coobit   2  blah  blah  blah
Pesik    0  blah  blah  blah
Montino  0  blah  blah  blah
Montino  1  blah  blah  blah
Upvotes: 0
Views: 32
Reputation: 323226
We can use groupby().cumcount() to number the rows within each company, then set both columns as the index:
df['index2'] = df.groupby('company').cumcount()
df = df.set_index(['company', 'index2']).sort_index()
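For reference, here is a minimal self-contained sketch of the same approach, run on the sample data from the question. The column name index2 is just an illustrative choice, and the DataFrame is rebuilt by hand from the table above, so treat the exact values as an assumption about how the real data looks.

```python
import pandas as pd

# Sample data rebuilt by hand from the table in the question.
df = pd.DataFrame({
    "ID": [23, 12, 62, 31, 92, 16],
    "name": ["Alex Monoson", "Johnny Johnson", "Hans Dupa",
             "Jessica Heiler", "Dominic Alvorine", "Hei Lee"],
    "company": ["Coobit", "Coobit", "Pesik", "Montino", "Montino", "Coobit"],
    "position": ["Sales manager", "Marketing manager", "Marketing manager",
                 "Engineer", "CFO", "CEO"],
})

# Number the rows within each company (0, 1, 2, ...) in order of appearance.
df["index2"] = df.groupby("company").cumcount()

# Use company plus the per-company counter as a two-level index.
result = df.set_index(["company", "index2"]).sort_index()
print(result)
```

Note that sort_index() orders the companies alphabetically, so Montino comes before Pesik here, unlike the order shown in the expected output. If the order of first appearance matters, that would need a different sort step.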
Upvotes: 1