Messak
Messak

Reputation: 463

sorting panda columns on ascending and then non 0 duplicates

i have a pandas dataframe that i am trying to sort

here it would be unsorted

col1   col2    col3   col4
  2       0       0    0    
  7    1207    3262    3 
  8    1212    3266    1    
  5    1246    3263    4 
  4       0       0    1 
  6    1232    3265    0       

this would be the ideal sort:

This would be the expected result

col1   col2    col3   col4
  6    1232    3265    0    
  2       0      0     0    
  8    1212    3266    1    
  4       0      0     1    
  7    1207    3262    3    
  5    1246    3263    4    

What i want to do is sort by col4, but if there is a duplicate, then use the row that has a valid value (non 0) in col2. the problem is that col2 may not be ascending or descending, just containing a valid value.

The only way i can think of doing this is creating new dataframes, but hopefully there is better way.

@ganderson's method ended up working for me

df.sort_values(['col4', 'col2'], ascending=[True,True], inplace=True)

Upvotes: 0

Views: 39

Answers (2)

BENY
BENY

Reputation: 323226

You may need create a help key here

df['helpkey']=df.col2.eq(0)
df.sort_values(['col4','helpkey']).drop('helpkey',1)
Out[98]: 
   col1  col2  col3  col4
5     6  1232  3265     0
0     2     0     0     0
2     8  1212  3266     1
4     4     0     0     1
1     7  1207  3262     3
3     5  1246  3263     4

Upvotes: 2

G. Anderson
G. Anderson

Reputation: 5955

In the provided example, this looks like

df.sort_values(['col4', 'col2'], ascending=[True,False]))

There is some abiguity in how you want the 2nd level of sort to be applied, but this should produce at least the posted output

Upvotes: 2

Related Questions