Reputation: 542
I have a dataframe with two columns. I am trying to create a third column based on the values in the other two. If the number in column b is positive, I want column c to equal column a * b.
If the number in column b is negative, I want column c to equal column a * b * 0.95.
An example of what I am trying to get:
col_a  col_b  col_c
100    1      100
100    -1     -95
100    10     1000
100    -10    -950
I have currently tried this:
def profit_calculation(value):
    if value<0:
        return(a * b * 0.95)
    else:
        return(a * b)

df['col_c']=df['col_b'].apply(profit_calculation)
But this seems to be incorrect.
Upvotes: 0
Views: 103
Reputation: 1
You can use a lambda function with DataFrame.apply to create new data based on data already in the dataframe (df). See the documentation for apply here => https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html
With axis=1, the function receives one row of the dataframe as its parameter and returns the new value for that row. So for each row we call profit_calculation and pass it that row's data, which gives it access to both col_a and col_b. Your version only received the single value from col_b, and a and b were never defined inside the function. So you have to replace your code with:
def profit_calculation(value):
    return value["col_b"] * value["col_a"] if value["col_b"] > 0 else value["col_b"] * value["col_a"] * .95

df['col_c'] = df.apply(lambda value: profit_calculation(value), axis=1)
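For reference, here is a minimal self-contained sketch of the same row-wise approach. It assumes the sample data from the question's table, and it passes the function to apply directly, which should behave the same as wrapping it in a lambda. The parameter name row is only illustrative.

```python
import pandas as pd

# Sample data copied from the example table in the question (an assumption
# about how the real dataframe is laid out).
df = pd.DataFrame({"col_a": [100, 100, 100, 100],
                   "col_b": [1, -1, 10, -10]})

def profit_calculation(row):
    # With axis=1, `row` is one row of df, so both columns are reachable by name.
    product = row["col_a"] * row["col_b"]
    # Apply the 0.95 factor only when col_b is not positive.
    return product if row["col_b"] > 0 else product * 0.95

# axis=1 makes apply call the function once per row instead of once per column.
df["col_c"] = df.apply(profit_calculation, axis=1)
print(df)
```

Note that apply with axis=1 runs a Python function call per row, so on large dataframes a vectorized approach is usually faster.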
Upvotes: 0
Reputation: 13821
You can use np.where and check whether column b is greater than 0 using gt:
import numpy as np
import pandas as pd
a_b = df.col_a.mul(df.col_b)
df['col_c'] = np.where(df['col_b'].gt(0), a_b, a_b.mul(0.95))
which prints:
>>> df
   col_a  col_b   col_c
0    100      1   100.0
1    100     -1   -95.0
2    100     10  1000.0
3    100    -10  -950.0
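The snippet above assumes df already exists. A self-contained sketch that can be pasted and run, assuming the sample data from the question, might look like this:

```python
import numpy as np
import pandas as pd

# Sample data copied from the example table in the question.
df = pd.DataFrame({"col_a": [100, 100, 100, 100],
                   "col_b": [1, -1, 10, -10]})

# Element-wise product of the two columns, computed once.
a_b = df["col_a"].mul(df["col_b"])

# np.where(condition, x, y) picks from x where the condition is True, else from y.
df["col_c"] = np.where(df["col_b"].gt(0), a_b, a_b.mul(0.95))
print(df)
```

This works on whole columns at once, so no Python-level loop over rows is needed.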
Upvotes: 1
Reputation: 6642
import pandas as pd

df = pd.DataFrame({"a": [100, 100, 100, 100],
                   "b": [1, -1, 10, -10]})
df.a * df.b * (1 - 0.05 * (df.b < 0))
# out:
0 100.0
1 -95.0
2 1000.0
3 -950.0
Explanation: When multiplied by the float 0.05, the boolean Series (df.b < 0) is treated as numeric (True=1, False=0). We therefore subtract 0.05 from 1 wherever b is negative and subtract 0 everywhere else, which gives a factor of 0.95 exactly where we need it and 1 otherwise.
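To see the cast step by step, here is a small sketch that builds the factor on its own. The names mask and factor are only illustrative.

```python
import pandas as pd

b = pd.Series([1, -1, 10, -10])

# Boolean Series: True where b is negative.
mask = b < 0

# Multiplying by a float turns True/False into 0.05/0.0,
# so the factor is roughly 0.95 for negative b and 1.0 otherwise.
factor = 1 - 0.05 * mask

print(mask.tolist())
print(factor.tolist())
```

Multiplying a * b by factor then gives the same result as the one-liner above.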
Upvotes: 1