Reputation: 2253
I have a df
named data
as follows:
id upper_ci lower_ci max_power_contractual
0 12858 60.19878860406808 49.827481214215204 0
1 12858 60.61189293066522 49.298784196530896 0
2 12858 60.34397624424309 49.718421137642885 70
3 12858 59.87472261936114 49.464255779713476 10
4 12858 60.2735279368527 49.41672240525131 0
I am trying to create a new column named up_threshold
as follows:
max_power_contractual
is either zero (0
) or NaN
, then the value in the up_threshold
should be the value in the upper_ci
max_power_contractual
is not zero, and the condition: max_power_contractual > upper_ci
is True
, then the value in the up_threshold
should be the value in the upper_ci
max_power_contractual < upper_ci
is True
, then the value in the up_threshold
should be the value in the max_power_contractual
I tried:
if (data['max_power_contractual'] in (0, np.nan)) or (data['max_power_contractual'] > data['upper_ci']):
data['up_threshold'] = data['upper_ci']
elif (data['upper_ci'] > data['max_power_contractual'] == 0):
data['up_threshold'] = data['max_power_contractual']
But it gives me the following error:
Traceback (most recent call last):
File "/home/cortex/.config/spyder-py3/temp.py", line 179, in data = cp_detection(data, threshold)
File "/home/cortex/.config/spyder-py3/temp.py", line 146, in cp_detection if data['max_power_contractual'] == 0:
File "/home/cortex/.local/lib/python3.7/site-packages/pandas/core/generic.py", line 1479, in nonzero f"The truth value of a {type(self).name} is ambiguous. "
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Can someone please tell me my mistake and how can I solve it?
Expected output:
id upper_ci lower_ci max_power_contractual up_threshold
0 12858 60.19878860406808 49.827481214215204 0 60.19878860406808 (Since `max_power_contractual` value is 0)
1 12858 60.61189293066522 49.298784196530896 NaN 60.61189293066522 (Since `max_power_contractual` value is NaN)
2 12858 60.34397624424309 49.718421137642885 70 60.34397624424309 (Since `upper_ci < max_power_contractual`)
3 12858 59.87472261936114 49.464255779713476 10 10 (Since `upper_ci > max_power_contractual`)
Upvotes: 0
Views: 42
Reputation: 17368
Not the efficient way but easier to understand
In [17]: def process(data):
...: result = None
...: if (data['max_power_contractual'] in (0, np.nan)) or (data['max_power_contractual'] > data['upper_ci']):
...: result = data['upper_ci']
...: elif (data['upper_ci'] > data['max_power_contractual']):
...: result = data['max_power_contractual']
...:
...: return result
...:
In [18]: df.apply(process, axis=1)
Out[18]:
0 60.198789
1 60.611893
2 60.343976
3 10.000000
4 60.273528
dtype: float64
In [19]: df["up_threshold"] = df.apply(process, axis=1)
In [20]: df
Out[20]:
id upper_ci lower_ci max_power_contractual up_threshold
0 12858 60.198789 49.827481 0 60.198789
1 12858 60.611893 49.298784 0 60.611893
2 12858 60.343976 49.718421 70 60.343976
3 12858 59.874723 49.464256 10 10.000000
4 12858 60.273528 49.416722 0 60.273528
Upvotes: 0
Reputation: 5037
Use np.select
import numpy as np
m1 = df.max_power_contractual.isin([np.NaN, 0])
m2 = df.max_power_contractual > df.upper_ci
df['up_threshold'] = np.select([m1, m2], [df.upper_ci, df.upper_ci], default=df.max_power_contractual)
print(df)
Output
id upper_ci lower_ci max_power_contractual up_threshold
0 12858 60.198789 49.827481 0 60.198789
1 12858 60.611893 49.298784 0 60.611893
2 12858 60.343976 49.718421 70 60.343976
3 12858 59.874723 49.464256 10 10.000000
4 12858 60.273528 49.416722 0 60.273528
Since we're using the same columns as choices, we can also use
df['up_threshold'] = np.where((m1 | m2), df.upper_ci, df.max_power_contractual)
Upvotes: 1
Reputation: 195418
You can use np.where
to add the new column:
df['up_threshold'] = np.where(df['max_power_contractual'].fillna(0) == 0, df['upper_ci'],
np.where(df['max_power_contractual'] > df['upper_ci'], df['upper_ci'], df['max_power_contractual'])
)
print(df)
Prints:
id upper_ci lower_ci max_power_contractual up_threshold
0 12858 60.198789 49.827481 0.0 60.198789
1 12858 60.611893 49.298784 NaN 60.611893
2 12858 60.343976 49.718421 70.0 60.343976
3 12858 59.874723 49.464256 10.0 10.000000
4 12858 60.273528 49.416722 0.0 60.273528
Upvotes: 2