Pandas conditional creation of a dataframe column: based on multiple conditions

Question

I have a df:

  col1 col2 col3
0    1    2    3
1    2    3    1
2    3    3    3
3    4    3    2

I want to add a new column based on the following conditions:

 - if   col1 > col2 > col3   ----->  2
 - elif col1 > col2          ----->  1
 - elif col1 < col2 < col3   -----> -2
 - elif col1 < col2          -----> -1
 - else                      ----->  0

And it should become this:

  col1 col2 col3   new
0    1    2    3   -2
1    2    3    1   -1
2    3    3    3    0
3    4    3    2    2

I followed the method from this post by unutbu, which works fine when each condition has a single greater-than or less-than comparison. But in my case, with more than one comparison chained in a condition, building conditions raises an error:

conditions = [
       (df['col1'] > df['col2'] > df['col3']), 
       (df['col1'] > df['col2']),
       (df['col1'] < df['col2'] < df['col3']),
       (df['col1'] < df['col2'])]
choices = [2,1,-2,-1]
df['new'] = np.select(conditions, choices, default=0)


Traceback (most recent call last):

  File "", line 2, in 
    (df['col1'] > df['col2'] > df['col3']),

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py", line 1478, in __nonzero__
    .format(self.__class__.__name__))

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How should I do this?

BENY · Accepted Answer

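Python evaluates a chained comparison such as a > b > c as (a > b) and (b > c), and `and` needs a single truth value for the Series a > b, which is what raises the ValueError. Write each comparison separately, wrap it in parentheses, and combine them with the element-wise & operator. Change your code to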

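import numpy as np

# Each comparison is parenthesized because & binds more tightly than > and <.
conditions = [
    (df['col1'] > df['col2']) & (df['col2'] > df['col3']),
    (df['col1'] > df['col2']),
    (df['col1'] < df['col2']) & (df['col2'] < df['col3']),
    (df['col1'] < df['col2'])]
choices = [2, 1, -2, -1]
# np.select uses the choice of the first condition that is True for each row.
df['new'] = np.select(conditions, choices, default=0)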
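
For anyone who wants to check this end to end, here is a minimal self-contained sketch that rebuilds the DataFrame from the question, shows that the chained comparison fails, and then applies the conditions above. The variable name chained_error is only illustrative and is not part of the original code.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": [1, 2, 3, 4],
    "col2": [2, 3, 3, 3],
    "col3": [3, 1, 3, 2],
})

# The chained form expands to (a > b) and (b > c); `and` asks the Series
# (a > b) for a single truth value, which pandas refuses to provide.
chained_error = None  # illustrative name, only used to capture the message
try:
    df["col1"] > df["col2"] > df["col3"]
except ValueError as exc:
    chained_error = str(exc)

# Element-wise version: one parenthesized comparison per side of &.
conditions = [
    (df["col1"] > df["col2"]) & (df["col2"] > df["col3"]),
    (df["col1"] > df["col2"]),
    (df["col1"] < df["col2"]) & (df["col2"] < df["col3"]),
    (df["col1"] < df["col2"]),
]
choices = [2, 1, -2, -1]

# The first matching condition wins, so the stricter tests are listed first.
df["new"] = np.select(conditions, choices, default=0)
print(df)
```

The order of conditions matters: if (df['col1'] > df['col2']) came before the stricter two-part test, row 3 would get 1 instead of 2.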
