Reputation: 463
i have a pandas dataframe that i am trying to sort
here it would be unsorted
col1 col2 col3 col4
2 0 0 0
7 1207 3262 3
8 1212 3266 1
5 1246 3263 4
4 0 0 1
6 1232 3265 0
this would be the ideal sort:
This would be the expected result
col1 col2 col3 col4
6 1232 3265 0
2 0 0 0
8 1212 3266 1
4 0 0 1
7 1207 3262 3
5 1246 3263 4
What i want to do is sort by col4, but if there is a duplicate, then use the row that has a valid value (non 0) in col2. the problem is that col2 may not be ascending or descending, just containing a valid value.
The only way i can think of doing this is creating new dataframes, but hopefully there is better way.
@ganderson's method ended up working for me
df.sort_values(['col4', 'col2'], ascending=[True,True], inplace=True)
Upvotes: 0
Views: 39
Reputation: 323226
You may need create a help key here
df['helpkey']=df.col2.eq(0)
df.sort_values(['col4','helpkey']).drop('helpkey',1)
Out[98]:
col1 col2 col3 col4
5 6 1232 3265 0
0 2 0 0 0
2 8 1212 3266 1
4 4 0 0 1
1 7 1207 3262 3
3 5 1246 3263 4
Upvotes: 2
Reputation: 5955
In the provided example, this looks like
df.sort_values(['col4', 'col2'], ascending=[True,False]))
There is some abiguity in how you want the 2nd level of sort to be applied, but this should produce at least the posted output
Upvotes: 2