CiaranWelsh

Reputation: 7681

How to correctly sort a multi-indexed pandas DataFrame

I have a multi-indexed pandas dataframe that looks like this:

Antibody                 Time Repeats           
Akt                      0    1         1.988053
                              2         1.855905
                              3         1.416557
                         5    1         1.143599
                              2         1.151358
                              3         1.272172
                         10   1         1.765615
                              2         1.779330
                              3         1.752246
                         20   1         1.685807
                              2         1.688354
                              3         1.614013
                         .....        ....
                         0    4         2.111466
                              5         1.933589
                              6         1.336527
                         5    4         2.006936
                              5         2.040884
                              6         1.430818
                         10   4         1.398334
                              5         1.594028
                              6         1.684037
                         20   4         1.529750
                              5         1.721385
                              6         1.608393

(Note that I've only posted one antibody; there are many analogous entries under the Antibody index, and they all have the same format.) Although I've left out the entries in the middle to save space, you can see that I have 6 experimental repeats, but they are not grouped properly: repeats 1-3 and repeats 4-6 for the same time point sit in separate blocks. My question is: how do I get the DataFrame to bring all the repeats for each time point together? The output would look something like this:

Antibody                 Time Repeats           
Akt                      0    1         1.988053
                              2         1.855905
                              3         1.416557
                              4         2.111466
                              5         1.933589
                              6         1.336527
                         5    1         1.143599
                              2         1.151358
                              3         1.272172
                              4         2.006936
                              5         2.040884
                              6         1.430818
                         10   1         1.765615
                              2         1.779330
                              3         1.752246
                              4         1.398334
                              5         1.594028
                              6         1.684037
                         20   1         1.685807
                              2         1.688354
                              3         1.614013
                              4         1.529750
                              5         1.721385
                              6         1.608393
                         .....        ....

Thanks in advance

Upvotes: 2

Views: 79

Answers (1)

jezrael

Reputation: 862611

I think you need sort_index:

df = df.sort_index(level=[0,1,2])
print(df)
Antibody  Time  Repeats
Akt       0     1          1.988053
                2          1.855905
                3          1.416557
                4          2.111466
                5          1.933589
                6          1.336527
          5     1          1.143599
                2          1.151358
                3          1.272172
                4          2.006936
                5          2.040884
                6          1.430818
          10    1          1.765615
                2          1.779330
                3          1.752246
                4          1.398334
                5          1.594028
                6          1.684037
          20    1          1.685807
                2          1.688354
                3          1.614013
                4          1.529750
                5          1.721385
                6          1.608393
Name: col, dtype: float64

Or you can omit the level parameter, since sort_index sorts by all index levels by default:

df = df.sort_index()
print(df)
Antibody  Time  Repeats
Akt       0     1          1.988053
                2          1.855905
                3          1.416557
                4          2.111466
                5          1.933589
                6          1.336527
          5     1          1.143599
                2          1.151358
                3          1.272172
                4          2.006936
                5          2.040884
                6          1.430818
          10    1          1.765615
                2          1.779330
                3          1.752246
                4          1.398334
                5          1.594028
                6          1.684037
          20    1          1.685807
                2          1.688354
                3          1.614013
                4          1.529750
                5          1.721385
                6          1.608393
Name: col, dtype: float64
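
For anyone who wants to try this without the original data, here is a minimal, self-contained sketch. It rebuilds a small slice of the question's data (only the Akt antibody at times 0 and 5) as a Series in the same out-of-order layout, then sorts it. The variable names s and sorted_s and the Series name "col" are assumptions for illustration; the values are copied from the question.

```python
import pandas as pd

# A small slice of the question's data, in the original order:
# repeats 1-3 for each time point first, then repeats 4-6.
index = pd.MultiIndex.from_tuples(
    [
        ("Akt", 0, 1), ("Akt", 0, 2), ("Akt", 0, 3),
        ("Akt", 5, 1), ("Akt", 5, 2), ("Akt", 5, 3),
        ("Akt", 0, 4), ("Akt", 0, 5), ("Akt", 0, 6),
        ("Akt", 5, 4), ("Akt", 5, 5), ("Akt", 5, 6),
    ],
    names=["Antibody", "Time", "Repeats"],
)
values = [
    1.988053, 1.855905, 1.416557,
    1.143599, 1.151358, 1.272172,
    2.111466, 1.933589, 1.336527,
    2.006936, 2.040884, 1.430818,
]
s = pd.Series(values, index=index, name="col")

# Sort by every index level in order: Antibody, then Time, then Repeats.
sorted_s = s.sort_index()
print(sorted_s)
```

Because Time and Repeats are integers here, they sort numerically. If those levels were stored as strings in your data, the order would be lexicographic instead, so it may be worth checking the level dtypes first.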

Upvotes: 2
