user16304089
user16304089

Reputation:

Pandas If Else condition on multiple columns

I have a df as:

df:
        col1       col2       col3        col4        col5
0       1.36       4.31       7.66           2           2
1       2.62       3.30       2.48           2           1
2       5.19       3.58       1.62           0           2
3       2.06       3.16       3.50           1           1
4       2.19       2.98       3.38           1           1

I want

col6 to return 1 when (col4 > 1 and col5 > 1) else 0

and

col7 to return 1 when (col4 > 1 and col5 > 1 and col 4 + col5 > 2) else 0

I am trying

df.loc[df['col4'] > 0, df['col5'] > 0, 'col6'] = '1'

however I am getting the error:

File "pandas\_libs\index.pyx", line 269, in pandas._libs.index.IndexEngine.get_indexer
  File "pandas\_libs\hashtable_class_helper.pxi", line 5247, in pandas._libs.hashtable.PyObjectHashTable.lookup
TypeError: unhashable type: 'Series'

How can I perform this operation?

Upvotes: 4

Views: 395

Answers (5)

sammywemmy
sammywemmy

Reputation: 28729

Create a variable to hold the columns :

columns = ['col4', 'col5']

Condition 1 :

cond1 = df.filter(columns).gt(1).all(1)

Condition 2:

cond2 = df.filter(columns).sum(1).gt(2)

Create new columns via assign:

df.assign(col6 = cond1.astype(int), 
          col7 = (cond1 & cond2).astype(int)
          )


   col1  col2  col3  col4  col5  col6  col7
0  1.36  4.31  7.66     2     2     1     1
1  2.62  3.30  2.48     2     1     0     0
2  5.19  3.58  1.62     0     2     0     0
3  2.06  3.16  3.50     1     1     0     0
4  2.19  2.98  3.38     1     1     0     0

Upvotes: 1

BENY
BENY

Reputation: 323386

We can do something like eval

df['col6'] = df.eval('col4 > 1 and col5 > 1').astype(int)
df['col7'] = df.eval('col4 > 1 and col5 > 1 and col4 + col5 > 2').astype(int)

Upvotes: 2

Mad Physicist
Mad Physicist

Reputation: 114558

When you do a bitwise operation on Series objects or arrays, you get an array of booleans, each of whose elements is True or False. Those are basically 0 or 1, and in fact more convenient in most cases:

df['col6'] = (df['col4'] > 1) & (df['col5'] > 1)
df['col7'] = df['col6']

That last one is not a clever trick. If two numbers are both >1, then of course their sum must be >2. If you absolutely want the integers 0 and 1 instead of booleans, use Series.astype:

df['col6'] = ((df['col4'] > 1) & (df['col5'] > 1)).astype(int)

Upvotes: 2

not_speshal
not_speshal

Reputation: 23166

You can simple use bitwise operators:

df['col6'] = ((df["col4"]>1) & (df["col5"]>1))*1
df['col7'] = ((df["col4"]>1) & (df["col5"]>1) & (df['col4']+df['col5']>2))*1

>>> df
   col1  col2  col3  col4  col5  col6  col7
0  1.36  4.31  7.66     2     2     1     1
1  2.62  3.30  2.48     2     1     0     0
2  5.19  3.58  1.62     0     2     0     0
3  2.06  3.16  3.50     1     1     0     0
4  2.19  2.98  3.38     1     1     0     0

Upvotes: 1

Anurag Dabas
Anurag Dabas

Reputation: 24322

try:

c1=df['col4'].gt(1) & df['col5'].gt(1)
#your 1st condition
c2=c1 & df['col4'].add(df['col5']).gt(2)
#your 2nd condition

Finally:

df['col6']=c1.astype(int)
df['col7']=c2.astype(int)

OR

via numpy's where() method:

c1=df['col4'].gt(1) & df['col5'].gt(1)
c2=c1 & df['col4'].add(df['col5']).gt(2)


df['col6']=np.where(c1,1,0)
df['col7']=np.where(c2,1,0)

Upvotes: 1

Related Questions