Reputation: 17468
Here is my data:
In [14]: new_df
Out[14]:
action_type 1 2 3
user_id
0000110e00f7c85f550b329dc3d76210 31.0 4.0 0.0
00004931fe12d6f678f67e375b3806e3 8.0 4.0 0.0
0000c2b8660766ed74bafd48599255f0 0.0 2.0 0.0
0000d8d4ea411b05e0392be855fe9756 19.0 0.0 3.0
ffff18540a9567b455bd5645873e56d5 1.0 0.0 0.0
ffff3c8cf716efa3ae6d3ecfedb2270b 58.0 2.0 0.0
ffffa5fe57d2ef322061513bf60362ff 0.0 2.0 0.0
ffffce218e2b4af7729a4737b8702950 1.0 0.0 0.0
ffffd17a96348904fe49216ba3c7006f 1.0 0.0 0.0
[9 rows x 3 columns]
In [15]: new_df.columns
Out[15]: Int64Index([1, 2, 3], dtype='int64', name=u'action_type')
In [16]: new_df.index
Out[16]:
Index([u'0000110e00f7c85f550b329dc3d76210',
u'00004931fe12d6f678f67e375b3806e3',
...
u'ffffa5fe57d2ef322061513bf60362ff',
u'ffffce218e2b4af7729a4737b8702950',
u'ffffd17a96348904fe49216ba3c7006f'],
dtype='object', name=u'user_id', length=9)
The output that I want is:
# sort by the action_type value 1
action_type 1 2 3
user_id
ffff3c8cf716efa3ae6d3ecfedb2270b 58.0 2.0 0.0
0000110e00f7c85f550b329dc3d76210 31.0 4.0 0.0
0000d8d4ea411b05e0392be855fe9756 19.0 0.0 3.0
00004931fe12d6f678f67e375b3806e3 8.0 4.0 0.0
ffff18540a9567b455bd5645873e56d5 1.0 0.0 0.0
ffffce218e2b4af7729a4737b8702950 1.0 0.0 0.0
ffffd17a96348904fe49216ba3c7006f 1.0 0.0 0.0
0000c2b8660766ed74bafd48599255f0 0.0 2.0 0.0
ffffa5fe57d2ef322061513bf60362ff 0.0 2.0 0.0
[9 rows x 3 columns]
# sort by the action_type value 2
action_type 1 2 3
user_id
00004931fe12d6f678f67e375b3806e3 8.0 4.0 0.0
0000110e00f7c85f550b329dc3d76210 31.0 4.0 0.0
ffff3c8cf716efa3ae6d3ecfedb2270b 58.0 2.0 0.0
0000c2b8660766ed74bafd48599255f0 0.0 2.0 0.0
ffffa5fe57d2ef322061513bf60362ff 0.0 2.0 0.0
0000d8d4ea411b05e0392be855fe9756 19.0 0.0 3.0
ffff18540a9567b455bd5645873e56d5 1.0 0.0 0.0
ffffce218e2b4af7729a4737b8702950 1.0 0.0 0.0
ffffd17a96348904fe49216ba3c7006f 1.0 0.0 0.0
[9 rows x 3 columns]
So what I want is to sort the DataFrame by the value of a single action_type (1, 2 or 3) for each user, or by the sum of any combination of them for each user (1+2, 1+3, 2+3 or 1+2+3).
For example:
for user id 0000110e00f7c85f550b329dc3d76210, the value of action_type 1 is 31.0, the value of action_type 2 is 4.0 and the value of action_type 3 is 0.0. The sum of action_type 1 and action_type 2 for this user is 31.0 + 4.0 = 35.0.
I have tried new_df.sortlevel(), but it seems to have sorted the DataFrame only by user_id, not by the action_type values (1, 2, 3).
How can I do this? Thank you!
Upvotes: 1
Views: 279
Reputation: 210832
UPDATE:
If you want to sort by the values in one or more columns, use sort_values:
df.sort_values(column_names)
Example:
In [173]: df
Out[173]:
1 2 3
0 6 3 8
1 0 8 0
2 3 8 0
3 5 2 7
4 1 2 1
Sort descending by column 2:
In [174]: df.sort_values(by=2, ascending=False)
Out[174]:
1 2 3
1 0 8 0
2 3 8 0
0 6 3 8
3 5 2 7
4 1 2 1
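Note that rows 1 and 2 tie on column 2. If the order within ties matters, one option is to pass a list of columns so that a second column breaks the tie. A minimal sketch on the same toy frame (the variable name by_2_then_1 is just for illustration):

```python
import pandas as pd

# Same toy frame as in the example above (integer column labels 1, 2, 3).
df = pd.DataFrame({1: [6, 0, 3, 5, 1],
                   2: [3, 8, 8, 2, 2],
                   3: [8, 0, 0, 7, 1]})

# Sort by column 2 first; rows that tie on column 2 are ordered by column 1.
by_2_then_1 = df.sort_values(by=[2, 1], ascending=[False, False])
print(by_2_then_1)
```

Here row 2 comes before row 1 because its value in column 1 is larger.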
Sort descending by the sum of columns 2 and 3:
In [177]: df.assign(sum=df.loc[:,[2,3]].sum(axis=1)).sort_values('sum', ascending=False)
Out[177]:
1 2 3 sum
0 6 3 8 11
3 5 2 7 9
1 0 8 0 8
2 3 8 0 8
4 1 2 1 3
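If you would rather not keep the extra sum column, the same idea can be wrapped in a small helper that works for any combination of action types. This is only a sketch: sort_by_action_types is a hypothetical name, not a pandas function, and the frame is rebuilt by hand from the data in the question. It negates the totals for descending order, so it assumes numeric columns.

```python
import pandas as pd

# Data copied from the question (user_id index, action_type columns 1, 2, 3).
new_df = pd.DataFrame(
    {
        1: [31.0, 8.0, 0.0, 19.0, 1.0, 58.0, 0.0, 1.0, 1.0],
        2: [4.0, 4.0, 2.0, 0.0, 0.0, 2.0, 2.0, 0.0, 0.0],
        3: [0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    },
    index=pd.Index(
        [
            "0000110e00f7c85f550b329dc3d76210",
            "00004931fe12d6f678f67e375b3806e3",
            "0000c2b8660766ed74bafd48599255f0",
            "0000d8d4ea411b05e0392be855fe9756",
            "ffff18540a9567b455bd5645873e56d5",
            "ffff3c8cf716efa3ae6d3ecfedb2270b",
            "ffffa5fe57d2ef322061513bf60362ff",
            "ffffce218e2b4af7729a4737b8702950",
            "ffffd17a96348904fe49216ba3c7006f",
        ],
        name="user_id",
    ),
)
new_df.columns.name = "action_type"


def sort_by_action_types(df, action_types, ascending=False):
    """Hypothetical helper: order rows by the row-wise sum of the given columns."""
    totals = df[list(action_types)].sum(axis=1).to_numpy()
    # Negate for descending order; a stable sort keeps tied rows in their
    # original relative order.
    keys = totals if ascending else -totals
    positions = keys.argsort(kind="stable")
    return df.iloc[positions]


by_1 = sort_by_action_types(new_df, [1])       # single action type
by_1_2 = sort_by_action_types(new_df, [1, 2])  # sum of action types 1 and 2
print(by_1)
print(by_1_2)
```

Passing [1, 3], [2, 3] or [1, 2, 3] covers the remaining combinations, and the original columns are returned unchanged.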
OLD answer:
If I understood you correctly, you can do it this way:
In [107]: df
Out[107]:
a b c
0 9 1 4
1 0 5 7
2 5 9 8
3 3 9 7
4 1 2 5
In [108]: df.assign(sum=df.sum(axis=1)).sort_values('sum', ascending=True)
Out[108]:
a b c sum
4 1 2 5 8
1 0 5 7 12
0 9 1 4 14
3 3 9 7 19
2 5 9 8 22
Upvotes: 2