Reputation: 1663
I have a pandas dataframe which looks like this:
value
Id
2014-03-13 1 -3
2 -6
3 -3.2
4 -3.1
5 -5
2014-03-14 1 -3.4
2 -6.2
3 -3.2
4 -3.2
5 -5.9
which is basically a groupby object with two levels of multi-index.
I want to sort it in ascending order according to the value
column, but keeping the level 0 (dates) untouched so that the result should look like this:
value
Id
2014-03-13 2 -6
5 -5
3 -3.2
4 -3.1
1 -3
2014-03-14 2 -6.2
5 -5.9
1 -3.4
3 -3.2
4 -3.2
Here is the code to generate the initial data:
import pandas as pd
dates = [pd.to_datetime('2014-03-13', format='%Y-%m-%d'), pd.to_datetime('2014-03-13', format='%Y-%m-%d'), pd.to_datetime('2014-03-13', format='%Y-%m-%d'), pd.to_datetime('2014-03-13', format='%Y-%m-%d'),
pd.to_datetime('2014-03-13', format='%Y-%m-%d'),pd.to_datetime('2014-03-14', format='%Y-%m-%d'), pd.to_datetime('2014-03-14', format='%Y-%m-%d'), pd.to_datetime('2014-03-14', format='%Y-%m-%d'),
pd.to_datetime('2014-03-14', format='%Y-%m-%d'), pd.to_datetime('2014-03-14', format='%Y-%m-%d')]
values = [-3,-6,-3.2,-3.1,-5,-3.4,-6.2,-3.2,-3.2,-5.9]
Ids = [1,2,3,4,5,1,2,3,4,5]
df = pd.DataFrame({'Id': pd.Series(Ids, index=dates),
'value': pd.Series(values, index=dates)})
df = df.groupby([df.index,'Id']).sum()
Upvotes: 5
Views: 6423
Reputation: 863166
For me works reset_index
+ sort_values
+ set_index
+ rename_axis
:
df = df.reset_index() \
.sort_values(['level_0','value']) \
.set_index(['level_0','Id']) \
.rename_axis([None, 'Id'])
print (df)
value
Id
2014-03-13 2 -6.0
5 -5.0
3 -3.2
4 -3.1
1 -3.0
2014-03-14 2 -6.2
5 -5.9
1 -3.4
3 -3.2
4 -3.2
Another solution with sort_values
+ swaplevel
+ sort_index
:
df = df.sort_values('value')
.swaplevel(0,1)
.sort_index(level=1, sort_remaining=False)
.swaplevel(0,1)
print (df)
value
Id
2014-03-13 2 -6.0
5 -5.0
3 -3.2
4 -3.1
1 -3.0
2014-03-14 2 -6.2
5 -5.9
1 -3.4
3 -3.2
4 -3.2
Swap levels is necessary because:
print (df.sort_values('value').sort_index(level=0, sort_remaining=False))
value
Id
2014-03-13 1 -3.0
2 -6.0
3 -3.2
4 -3.1
5 -5.0
2014-03-14 1 -3.4
2 -6.2
3 -3.2
4 -3.2
5 -5.9
For pandas 0.23.0
is possible sort columns and index levels together:
df.index.names = ['level1','level2']
print (df.sort_values(['level1','value']))
value
level1 level2
2014-03-13 2 -6.0
5 -5.0
3 -3.2
4 -3.1
1 -3.0
2014-03-14 2 -6.2
5 -5.9
1 -3.4
3 -3.2
4 -3.2
Upvotes: 9
Reputation: 8996
To my knowledge, a simultaneous sort on both index and column isn't possible, but a simple workaround would be the following:
df = df.reset_index().sort_values(by = ['level_0','values']).set_index(['level_0','Id'])
...and if you need to get rid of the 'level_0' index label:
df.index.names = [None, 'Id']
Setup:
import pandas as pd
import io
c = io.StringIO(u'''
Id value
2014-03-13 1 -3
2014-03-13 2 -6
2014-03-13 3 -3.2 2014-03-13 4 -3.1
2014-03-13 5 -5
2014-03-14 1 -3.4
2014-03-14 2 -6.2
2014-03-14 3 -3.2
2014-03-14 4 -3.2
2014-03-14 5 -5.9
''')
df = pd.read_csv(c, delim_whitespace = True)
df = df.groupby([df.index,'Id']).max()
Initial df:
value
Id
2014-03-13 1 -3.0
2 -6.0
3 -3.2
4 -3.1
5 -5.0
2014-03-14 1 -3.4
2 -6.2
3 -3.2
4 -3.2
5 -5.9
Ouput:
value
Id
2014-03-13 2 -6.0
5 -5.0
3 -3.2
4 -3.1
1 -3.0
2014-03-14 2 -6.2
5 -5.9
1 -3.4
3 -3.2
4 -3.2
Upvotes: 0