User_13
User_13

Reputation: 21

Sorting rows in multi-index dataframe while keeping secondary index in tact

I have the following multi-index dataframe set up:

                              created_at  ... compound
conv_nr elem_nr                            ...         
0       0       2020-03-30 18:41:32+00:00  ...   0.7184
        1       2020-03-30 18:31:47+00:00  ...  -0.0003
1       0       2020-03-30 18:35:15+00:00  ...  -0.3612
        1       2020-03-30 18:34:00+00:00  ...   0.1877
        2       2020-03-30 18:29:36+00:00  ...  -0.1027
...                                   ...  ...      ...
29071   1       2019-05-22 12:58:12+00:00  ...   0.0000
29072   0       2019-05-22 13:20:31+00:00  ...  -0.6619
        1       2019-05-22 12:58:12+00:00  ...   0.0000
29073   0       2019-05-22 13:20:05+00:00  ...   0.7506
        1       2019-05-22 12:58:12+00:00  ...   0.0000

I am struggling to order this so that the time stamps are in ascending order and that "elem_nr" is also reindexed accordingly (stays in tact). Consider when conv_nr = 0. The outcome should then be:

                               created_at  ... compound
conv_nr elem_nr                            ...         
0       0       2020-03-30 18:31:47+00:00  ...  -0.0003
        1       2020-03-30 18:41:32+00:00  ...   0.7184

So essentially I need to get ascending timestamps while ensuring the "elem_nr" does not get flipped/stays in place.

Upvotes: 0

Views: 46

Answers (1)

Henry Ecker
Henry Ecker

Reputation: 35636

IIUC try using .values or .to_numpy() to prevent pandas from affecting the index:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'conv_nr': [0, 0, 1, 1, 1],
    'elem_nr': [0, 1, 0, 1, 2, ],
    'created_at': ['2020-03-30 18:41:32+00:00',
                   '2020-03-30 18:31:47+00:00',
                   '2020-03-30 18:35:15+00:00',
                   '2020-03-30 18:34:00+00:00',
                   '2020-03-30 18:29:36+00:00'],
    'compound': [0.7184, -0.0003, -0.3612, 0.1877, -0.1027]
})
df['created_at'] = pd.to_datetime(df['created_at'])
df = df.set_index(['conv_nr', 'elem_nr'])

# Use df.column accessor to select all columns
# Use .values to aligning by index
df[df.columns] = df.sort_values(
    ['conv_nr', 'created_at'], ascending=True
).values
print(df)

df:

                               created_at  compound
conv_nr elem_nr                                    
0       0       2020-03-30 18:31:47+00:00   -0.0003
        1       2020-03-30 18:41:32+00:00    0.7184
1       0       2020-03-30 18:29:36+00:00   -0.1027
        1       2020-03-30 18:34:00+00:00    0.1877
        2       2020-03-30 18:35:15+00:00   -0.3612

Upvotes: 1

Related Questions