Reputation: 69
I am trying to create a new column in pandas using an if statement. I have this df:
df = {'Col1': [7,6,-9],
'Col2': [0.5,0.5,0.5],
'Col3': [5,4,3]}
If Col1 is greater than 0, then I'd like to multiply Col2 by Col3 to create the new column, Col4. If Col1 is not greater than 0, then I'd just like to return 0 as the column value.
Here is what I tried:
df['Col4'] = if df['Col1'] > 0:
    df['Col2'] * df['Col3']
else:
    0
I get the error: "SyntaxError: invalid syntax"
The final answer should look like this:
df = {'Col1': [7,6,-9],
'Col2': [0.5,0.5,0.5],
'Col3': [5,4,3],
'Col4': [2.5,2,0]}
Note that because "-9" in Col1 is not greater than 0, Col4 should give 0.
Upvotes: 2
Views: 4655
Reputation: 1539
Your syntax is invalid: an if statement cannot be used on the right-hand side of an assignment. I think this is closer to what you wanted:
import pandas as pd

df = pd.DataFrame({'Col1': [7, 6, -9],
                   'Col2': [0.5, 0.5, 0.5],
                   'Col3': [5, 4, 3]})
print(df)
print()

def product(row):
    # Multiply Col2 by Col3 only when Col1 is positive
    if row['Col1'] > 0:
        return row['Col2'] * row['Col3']
    else:
        return 0

# axis=1 passes each row to product
df['Col4'] = df.apply(product, axis=1)
print(df)
Output:
   Col1  Col2  Col3  Col4
0     7   0.5     5   2.5
1     6   0.5     4   2.0
2    -9   0.5     3   0.0
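The same row-wise logic can also be written with Python's conditional expression (`a if condition else b`), which is the expression form of what the question attempted. This is a minimal sketch assuming the same sample data; the lambda is only illustrative and not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({'Col1': [7, 6, -9],
                   'Col2': [0.5, 0.5, 0.5],
                   'Col3': [5, 4, 3]})

# A conditional expression yields a value, so it can sit inside a lambda;
# axis=1 calls the lambda once per row.
df['Col4'] = df.apply(
    lambda row: row['Col2'] * row['Col3'] if row['Col1'] > 0 else 0,
    axis=1,
)
print(df)
```

Because the function runs once per row, this is likely to be slower than a vectorized approach on large frames.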
Upvotes: 1
Reputation: 19885
I would use np.where:
>>> df['Col4'] = np.where(df['Col1'] > 0, df['Col2'] * df['Col3'], 0)
>>> df
   Col1  Col2  Col3  Col4
0     7   0.5     5   2.5
1     6   0.5     4   2.0
2    -9   0.5     3   0.0
Basically, where df['Col1'] is more than zero, the corresponding element in Col4 will be df['Col2'] * df['Col3']. Otherwise, it will be zero.
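For reference, here is a self-contained sketch of the np.where approach with the imports and sample data from the question filled in. The intermediate `mask` variable is an assumption added for readability:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': [7, 6, -9],
                   'Col2': [0.5, 0.5, 0.5],
                   'Col3': [5, 4, 3]})

# Boolean Series: True where Col1 is positive
mask = df['Col1'] > 0

# Take the product where the mask is True, and 0 everywhere else
df['Col4'] = np.where(mask, df['Col2'] * df['Col3'], 0)
print(df)
```

Note that the product is computed for every row and np.where only selects between the two results, so both branches must be valid for all rows.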
There's also pd.Series.where (with a DataFrame counterpart, pd.DataFrame.where), which I find a bit more unwieldy:
>>> df['Col4'] = (df['Col2'] * df['Col3']).where(df['Col1'] > 0, 0)
You can see this answer for details.
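As a rough illustration of how `where` reads, the sketch below assumes the question's sample data. `where` keeps values where the condition is True and replaces the rest, while `mask` does the opposite. The variable names `product`, `kept` and `masked` are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({'Col1': [7, 6, -9],
                   'Col2': [0.5, 0.5, 0.5],
                   'Col3': [5, 4, 3]})

product = df['Col2'] * df['Col3']

# where: keep the product where Col1 > 0, otherwise use 0
kept = product.where(df['Col1'] > 0, 0)

# mask: replace the product with 0 where Col1 <= 0 (the inverted condition)
masked = product.mask(df['Col1'] <= 0, 0)

df['Col4'] = kept
print(df)
```

The two calls give the same result here because the sample data contains no missing values, for which neither comparison would be True.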
Upvotes: 3