Sort columns in Pivot table with Pandas

Question

folks. I've been through all the questions related to sorting columns with pivot tables but couldn't find something just as I needed. I have a dataframe of this kind:

        Date  Moisture     Accum  Year  DayOfYear
0 2000-01-01  0.408640  0.408640  2000          1
1 2000-01-02  0.433425  0.842065  2000          2
2 2000-01-03  0.429745  1.271810  2000          3
3 2000-01-04  0.427589  1.699399  2000          4
4 2000-01-05  0.428700  2.128098  2000          5

And I created a Pivot table from it and calculated another column from the existing data:

mean1 = pd.pivot_table(c1, index = 'DayOfYear', columns = 'Year', values = 'Moisture')
mean1['Mean'] = mean1.mean(axis = 1)

I obtained something like this:

Year           2000      2001      2002  ...      2018      2019      Mean
DayOfYear                                ...                              
1          0.408640  0.433016  0.420326  ...  0.423164  0.328385  0.401896
2          0.433425  0.423607  0.414502  ...  0.419587  0.322804  0.398434
3          0.429745  0.418132  0.404171  ...  0.417384  0.318795  0.396913
4          0.427589  0.407190  0.394478  ...  0.420361  0.316989  0.398425
5          0.428700  0.401072  0.386432  ...  0.417026  0.313664  0.396777

I want to sort the values for each year, but I haven't been able to make it happen. I've tried this:

mean1 = mean1.sort_values('2000', ascending = True, axis = 0)

But I get KeyError: '2000'. I also tried sorting by the value I made the pivot table from ('Moisture'), as responses to other questions recommended, but it keeps showing a similar error. If I try to sort the values from the 'Mean' column I do get the sorted column, but it can't be done (apparently) for the year columns (e.g. '2000'). What am I missing?

Ralubrusto · Accepted Answer

If you want to sort every column independently, you may try this solution, which is the same as:

import numpy as np

mean2 = pd.DataFrame(np.sort(mean1.values, axis=0), index=mean1.index, columns=mean1.columns)

Which will give you a DataFrame with each column sorted individually.

Year           2000      2001      2002      2018      2019      Mean
DayOfYear                                                            
1          0.408640  0.401072  0.386432  0.417026  0.313664  0.389379
2          0.427589  0.407190  0.394478  0.417384  0.316989  0.393321
3          0.428700  0.418132  0.404171  0.419587  0.318795  0.397645
4          0.429745  0.423607  0.414502  0.420361  0.322804  0.402706
5          0.433425  0.433016  0.420326  0.423164  0.328385  0.402785

But now the index doesn't make sense at all, since all cells were reordered. So maybe you'll want to reindex it.

Sort columns in Pivot table with Pandas

Answers (2)

Related Questions