HDBrew

Reputation: 69

Create new column in pandas using if statement

I am trying to create a new column in pandas using an if statement. I have this df:

df = {'Col1': [7,6,-9],
      'Col2': [0.5,0.5,0.5],
      'Col3': [5,4,3]}

If Col1 is greater than 0, then I'd like to multiply Col2 by Col3 to create the new column, Col4. If Col1 is not greater than 0, then I'd just like to return 0 as the column value.

Here is what I tried:

df['Col4'] = if df['Col1'] > 0:
    df['Col2'] * df['Col3']
else:
    0  

I get the error: "SyntaxError: invalid syntax"

The final answer should look like this:

df = {'Col1': [7,6,-9],
      'Col2': [0.5,0.5,0.5],
      'Col3': [5,4,3],
      'Col4': [2.5,2,0]}

Note that because the last value in Col1, -9, is not greater than 0, Col4 should be 0 for that row.

Upvotes: 2

Views: 4655

Answers (2)

MarkS

Reputation: 1539

Your syntax is invalid: if is a statement, so it can't appear on the right-hand side of an assignment. I think this is closer to what you wanted:

import pandas as pd

df = pd.DataFrame({'Col1': [7, 6, -9],
                   'Col2': [0.5, 0.5, 0.5],
                   'Col3': [5, 4, 3]})
print(df)
print()

def product(row):
    if row['Col1'] > 0:
        return row['Col2'] * row['Col3']
    else:
        return 0


df['Col4'] = df.apply(product, axis=1)
print(df)

Output:

   Col1  Col2  Col3  Col4
0     7   0.5     5   2.5
1     6   0.5     4   2.0
2    -9   0.5     3   0.0
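For what it's worth, the row-wise function above can also be written inline. This is only a sketch of the same idea, assuming a conditional expression (`a if cond else b`) reads well enough for your case; unlike an `if` statement, it is an expression, so Python accepts it where a value is expected. The lambda is illustrative and not part of the original code.

```python
import pandas as pd

df = pd.DataFrame({'Col1': [7, 6, -9],
                   'Col2': [0.5, 0.5, 0.5],
                   'Col3': [5, 4, 3]})

# Same logic as product(row), expressed as a conditional expression.
# apply(..., axis=1) calls the lambda once per row.
df['Col4'] = df.apply(
    lambda row: row['Col2'] * row['Col3'] if row['Col1'] > 0 else 0,
    axis=1,
)
print(df)
```

Either way this is a Python-level loop over the rows, so it tends to be slower on large frames than a vectorized approach.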

Upvotes: 1

gmds

Reputation: 19885

I would use np.where:

>>> import numpy as np
>>> df['Col4'] = np.where(df['Col1'] > 0, df['Col2'] * df['Col3'], 0)
>>> df
   Col1  Col2  Col3  Col4
0     7   0.5     5   2.5
1     6   0.5     4   2.0
2    -9   0.5     3   0.0

Basically, where df['Col1'] is more than zero, the corresponding element in Col4 will be df['Col2'] * df['Col3']. Otherwise, it will be zero.
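To make the element-wise selection easier to follow, here is a small self-contained sketch that splits the call into its pieces. The intermediate names `mask` and `product` are just for illustration. Note that both the product and the fallback are computed for every row; np.where only picks between them.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': [7, 6, -9],
                   'Col2': [0.5, 0.5, 0.5],
                   'Col3': [5, 4, 3]})

# Boolean Series: True where Col1 is greater than 0.
mask = df['Col1'] > 0
print(mask.tolist())     # [True, True, False]

# Product computed for every row, including the one that will be discarded.
product = df['Col2'] * df['Col3']
print(product.tolist())  # [2.5, 2.0, 1.5]

# Take product where mask is True, otherwise 0.
df['Col4'] = np.where(mask, product, 0)
print(df)
```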

There's also pd.Series.where (and the equivalent pd.DataFrame.where), which I find a bit more unwieldy:

>>> df['Col4'] = (df['Col2'] * df['Col3']).where(df['Col1'] > 0, 0)

You can see this answer for details.
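As a rough check that the two spellings agree, the sketch below computes the column with np.where, with Series.where, and with Series.mask (the inverse of where: it replaces values where the condition is True). The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': [7, 6, -9],
                   'Col2': [0.5, 0.5, 0.5],
                   'Col3': [5, 4, 3]})

product = df['Col2'] * df['Col3']
positive = df['Col1'] > 0

# np.where(cond, x, y): x where cond is True, y elsewhere.
via_np = pd.Series(np.where(positive, product, 0), index=df.index)

# Series.where(cond, other): keep the value where cond is True, else other.
via_where = product.where(positive, 0)

# Series.mask(cond, other): replace the value where cond is True.
via_mask = product.mask(~positive, 0)

print(via_np.tolist(), via_where.tolist(), via_mask.tolist())
```

The argument order is the main thing to watch: np.where takes the condition first, while Series.where is called on the values you want to keep.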

Upvotes: 3
