Sorting by two columns in pandas series

Question

Putting a slight variation on a question I previously asked. I managed to get a solution to sorting values by a particular column in my pandas series. However, the problem is that sorting purely by time doesn't allow me to factor in different dates in which the time occurred. I understand that I could potentially hard code the order and use .loc to apply the order but wanted to find out if there was a simpler solution to sort primarily by week (earliest week first) and by time (0-23hours for each week).

Here is a sample of the dataframe I have again:

weeknum   time_hour
16-22Jun  0.0           5
2-8Jun    0.0           3
23-29Jun  0.0          11
9-15Jun   0.0           3
16-22Jun  1.0           3
2-8Jun    1.0           6
23-29Jun  1.0           3
9-15Jun   1.0           8
16-22Jun  2.0           3
2-8Jun    2.0           6
23-29Jun  2.0           3
16-22Jun  3.0           4
2-8Jun    3.0           2
23-29Jun  3.0           3
9-15Jun   3.0           4
16-22Jun  4.0           2
2-8Jun    4.0           7
23-29Jun  4.0           1
9-15Jun   4.0           7
16-22Jun  5.0           2
2-8Jun    5.0           9
23-29Jun  5.0           9
9-15Jun   5.0          12
16-22Jun  6.0           5
2-8Jun    6.0          12
23-29Jun  6.0           6
9-15Jun   6.0          14
16-22Jun  7.0          12
2-8Jun    7.0          17
23-29Jun  7.0          19

This is my code:

merged_clean.groupby('weeknum')['time_hour'].value_counts().sort_index(level=['time_hour'])

jezrael · Accepted Answer

Use function sorted by multiple keys for sorting MultiIndex with convert first number before - and for change order use DataFrame.reindex:

s = merged_clean.groupby('weeknum')['time_hour'].value_counts()
idx = sorted(s.index, key = lambda x: (int(x[0].split('-')[0]), x[1]))
s = s.reindex(idx)

print (s)
weeknum   time_hour
2-8Jun    0.0           3
          1.0           6
          2.0           6
          3.0           2
          4.0           7
          5.0           9
          6.0          12
          7.0          17
9-15Jun   0.0           3
          1.0           8
          3.0           4
          4.0           7
          5.0          12
          6.0          14
16-22Jun  0.0           5
          1.0           3
          2.0           3
          3.0           4
          4.0           2
          5.0           2
          6.0           5
          7.0          12
23-29Jun  0.0          11
          1.0           3
          2.0           3
          3.0           3
          4.0           1
          5.0           9
          6.0           6
          7.0          19
Name: a, dtype: int64

Sorting by two columns in pandas series

Answers (1)

Related Questions