gibbz00
gibbz00

Reputation: 1987

Correcting the sort order of pandas index

I have a dataframe that looks like the following. My Date field is of dtype datetime64[ns]:

           symbol        high         low
Date                                      
2018-08-16     spy  285.040009  283.359985
2018-08-17     spy  285.559998  283.369995
2018-08-16    nflx  331.170013  321.209991
2018-08-17    nflx  324.369995  312.959991
2017-07-17     spy  245.910004  245.330002
2017-07-18     spy  245.720001  244.669998

My goal is to set an index by symbol first and then by Date like the following:

                          high         low
symbol Date 
spy     2017-07-17  245.910004  245.330002
        2017-07-18  245.720001  244.669998                             
        2018-08-16  285.040009  283.359985
        2018-08-17  285.559998  283.369995
nflx    2018-08-16  331.170013  321.209991
        2018-08-17  324.369995  312.959991

Following is my attempt: After resetting Date index by doing, the output looks like the following:

df.reset_index(level=['Date'], inplace=True)

        Date symbol        high         low
0 2018-08-16     spy  285.040009  283.359985
1 2018-08-17     spy  285.559998  283.369995
2 2018-08-16    nflx  331.170013  321.209991
3 2018-08-17    nflx  324.369995  312.959991
4 2017-07-17     spy  245.910004  245.330002
5 2017-07-18     spy  245.720001  244.669998

And finally setting the index on symbol and Date which is returning the undesired output:

df.set_index(['symbol', 'Date'], inplace=True)

                          high         low
symbol Date                              
spy     2018-08-16  285.040009  283.359985
        2018-08-17  285.559998  283.369995
nflx    2018-08-16  331.170013  321.209991
        2018-08-17  324.369995  312.959991
spy     2017-07-17  245.910004  245.330002
        2017-07-18  245.720001  244.669998

Upvotes: 1

Views: 58

Answers (2)

Yuca
Yuca

Reputation: 6091

Not a fan of inplace, but try pd.sort_index():

df.reset_index(level=['Date'], inplace= True)
df.set_index(['symbol', 'Date'], inplace=True)
print(df.sort_index())

Output:

                        high         low
symbol Date                              
nflx   2018-08-16  331.170013  321.209991
       2018-08-17  324.369995  312.959991
spy    2017-07-17  245.910004  245.330002
       2017-07-18  245.720001  244.669998
       2018-08-16  285.040009  283.359985
       2018-08-17  285.559998  283.369995

Upvotes: 1

sacuL
sacuL

Reputation: 51405

IIUC, You can try using swaplevel followed by sort_index:

df.set_index('symbol', append=True).swaplevel().sort_index(level=[0,1],ascending=[False,True])

                         high         low
symbol Date                              
spy    2017-07-17  245.910004  245.330002
       2017-07-18  245.720001  244.669998
       2018-08-16  285.040009  283.359985
       2018-08-17  285.559998  283.369995
nflx   2018-08-16  331.170013  321.209991
       2018-08-17  324.369995  312.959991

Upvotes: 1

Related Questions