Reputation: 305
I have a dataframe as below:
      A   B   C   D
0  hola  32  43  54
1   hey  87  67  45
2    hi  10  54  89
3  hola  19  34  12
4    hi  11  59  09
I need to set a multilevel index on A and B, with the rows grouped by A. I need the following dataframe:
A     B   C   D
hola  32  43  54
      19  34  12
hey   87  67  45
hi    10  54  89
      11  59  09
I have tried using df.set_index(['A','B']) and I get:
A     B   C   D
hola  32  43  54
hola  19  34  12
hey   87  67  45
hi    10  54  89
hi    11  59  09
Upvotes: 2
Views: 9520
Reputation: 7038
Sorting first is not necessary. You are already setting the MultiIndex, but it is not lexsorted:
df.set_index(['A','B']).index.is_lexsorted()
False
As an alternate method, set the index as you already have and sort it after the fact:
df.set_index(['A','B']).sort_index()
          C   D
A    B
hey  87  67  45
hi   10  54  89
     11  59   9
hola 19  34  12
     32  43  54
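For anyone who wants to reproduce this, here is a minimal sketch. The dataframe is rebuilt by hand from the question, with D read as integers (so 09 becomes 9). One assumption to flag: as far as I know, is_lexsorted() is deprecated in newer pandas releases and no longer available in pandas 2.x, so the sketch checks is_monotonic_increasing instead, which should tell you the same thing here.

```python
import pandas as pd

# Dataframe rebuilt from the question (values typed in by hand).
df = pd.DataFrame({
    "A": ["hola", "hey", "hi", "hola", "hi"],
    "B": [32, 87, 10, 19, 11],
    "C": [43, 67, 54, 34, 59],
    "D": [54, 45, 89, 12, 9],
})

# Setting the index alone keeps the original row order.
unsorted = df.set_index(["A", "B"])

# Sorting the index afterwards puts equal A labels next to each other.
result = unsorted.sort_index()

# Check whether each index is sorted (stand-in for is_lexsorted()).
unsorted_is_sorted = unsorted.index.is_monotonic_increasing
result_is_sorted = result.index.is_monotonic_increasing

print(result)
```

Once equal A labels are adjacent, pandas only prints the first one in each run, which is the grouped look the question asks for.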
Upvotes: 0
Reputation: 19957
You need to sort first.
df.sort_values(['A','B']).set_index(['A','B'])
Out[60]:
          C   D
A    B
hey  87  67  45
hi   10  54  89
     11  59   9
hola 19  34  12
     32  43  54
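Note that this gives alphabetical order (hey, hi, hola), while the output in the question keeps the groups in order of first appearance (hola, hey, hi). A possible sketch for that variant is below. It assumes pandas 1.1 or later for the key argument of sort_values, and the names order and grouped are just illustrative. The dataframe is rebuilt by hand from the question.

```python
import pandas as pd

# Dataframe rebuilt from the question (values typed in by hand).
df = pd.DataFrame({
    "A": ["hola", "hey", "hi", "hola", "hi"],
    "B": [32, 87, 10, 19, 11],
    "C": [43, 67, 54, 34, 59],
    "D": [54, 45, 89, 12, 9],
})

# Approach from this answer: sort the columns, then set the index.
sorted_first = df.sort_values(["A", "B"]).set_index(["A", "B"])

# Variant: rank each A label by where it first appears, then do a
# stable sort on that rank so rows within a group keep their order.
order = {name: i for i, name in enumerate(df["A"].unique())}
grouped = (
    df.sort_values("A", key=lambda s: s.map(order), kind="stable")
      .set_index(["A", "B"])
)

print(sorted_first)
print(grouped)
```

The second result is grouped but not lexsorted, so pandas may warn about performance on some MultiIndex lookups. If that matters, the fully sorted version is probably the safer choice.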
Upvotes: 5