Pandas rank subset of rows based on condition column

Question

I want to rank the dataframe below by score, but only for rows where condition is False. The remaining rows should have a rank of NaN.

df = pd.DataFrame(np.array([[34, 65, 12, 98, 5], [False, False, True, False, False]]).T, index=['A', 'B', 'C', 'D', 'E'], columns=['score', 'condition'])

The desired output with the (descending) conditional rank would be:

   score  condition  cond_rank
A     34          0          3
B     65          0          2
C     12          1        NaN
D     98          0          1
E      5          0          4

I know pd.DataFrame.rank() can handle NaN for the values that are being ranked, but in cases where the conditioning is intended on another column/series, what is the most efficient way to achieve this?

jezrael · Accepted Answer

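You can filter the rows by the condition column, rank only the filtered scores, and assign the result back; rows that were filtered out receive NaN through index alignment. The astype(bool) cast is needed because the np.array construction in the question stores condition as integers (0/1), not booleans. Note that rank() is ascending by default; pass ascending=False to get the descending rank shown in the question: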

df['new'] = df.loc[~df['condition'].astype(bool), 'score'].rank()
print (df)
   score  condition  new
A     34          0  2.0
B     65          0  3.0
C     12          1  NaN
D     98          0  4.0
E      5          0  1.0
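
For the descending rank the question asks for, a minimal sketch along the same lines might look like the following. It assumes the frame is built from a dict so that condition is a real boolean column (which makes the astype(bool) cast unnecessary); the names mask, cond_rank and alt_rank are illustrative only. The second variant relies on rank() leaving NaN values unranked by default (na_option='keep').

```python
import pandas as pd

# Assumption: build the frame from a dict so 'condition' keeps a bool dtype.
df = pd.DataFrame(
    {"score": [34, 65, 12, 98, 5],
     "condition": [False, False, True, False, False]},
    index=["A", "B", "C", "D", "E"],
)

# Rows to rank: those where condition is False.
mask = ~df["condition"]

# Rank only the selected scores in descending order; on assignment,
# index alignment fills the excluded rows with NaN.
df["cond_rank"] = df.loc[mask, "score"].rank(ascending=False)

# Alternative: replace excluded scores with NaN first, then rank.
# rank() keeps NaN as NaN by default, so no filtering is needed.
alt_rank = df["score"].where(mask).rank(ascending=False)

print(df)
```

Both variants should give the same ranks; the where() form avoids the intermediate filtered Series and can be convenient when chaining.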
