Reputation: 515
using Python3 Pandas, I'm trying to calculate RESULT. I keep getting Boolean ambiguous value errors. Do I need to test that each date column I compare with is not null first to avoid an error? The end result should mimic:
#check if D3_UNTIL is not empty
if df.RUNNING_DATE.isna()==False:
if df.D3_UNTIL.isna()==False:
if.RUNNING_DATE >= df.D3_UNTIL:
df.RESULT = df.DVAL3
elif (df.RUNNING_DATE >= df.D2_UNTIL & df.RUNNING_DATE < df.D3_UNTIL):
df.RESULT = df.DVAL2
elif (df.RUNNING_DATE >= df.D1_UNTIL & df.RUNNING_DATE < df.D2_UNTIL):
df.RESULT = df.DVAL1
else None
#check if D2.UNTIL is not empty
elif df.D2_UNTIL.isna()==False:
if.RUNNING_DATE >= df.D2_UNTIL:
df.RESULT = df.DVAL2
elif (df.RUNNING_DATE >= df.D1_UNTIL & df.RUNNING_DATE < df.D2_UNTIL):
df.RESULT = df.DVAL1
else None
#check if D1.UNTIL is not empty
elif df.D1_UNTIL.isna()==False:
if.RUNNING_DATE >= df.D1_UNTIL:
df.RESULT = df.DVAL1
else None
else None
RUNNING_DATE D1_UNTIL DVAL1 D2_UNTIL DVAL2 D3_UNTIL DVAL3 RESULT
1/1/2018 1/1/2018 10 10
1/2/2018
1/3/2018 1/1/2018
1/4/2018 1/1/2018 10 1/3/2018 15
1/5/2018 1/1/2018 10 1/3/2018 20 1/31/2018 100 20
1/6/2018 1/1/2018 10 999
1/7/2018 1/1/2018 10 1/4/2018 25 1/6/2018 300 300
Upvotes: 0
Views: 2866
Reputation: 59519
With if-else statements you can use np.select
to implement your logic. Also checking df.RUNNING_DATE.isna()==False
is superfluous; just use df.RUNNING_DATE.notnull()
.
Further, the logic here can be simplified immensely.
>=
, ==
, or <=
date comparison with NaT
will return
False
, so first checking if the value is finite not needed when already checking if RUNNING_DATE is greater. Also any comparison of NaT
with NaT
returns False
, giving us a default check if RUNNING_DATE is null.>=
sequentially.import pandas as pd
import numpy a np
# Ensure Datetime
#df['RUNNING_DATE'] = pd.to_datetime(df.RUNNING_DATE, errors='coerce')
#df['D1_UNTIL'] = pd.to_datetime(df.D1_UNTIL, errors='coerce')
#df['D2_UNTIL'] = pd.to_datetime(df.D2_UNTIL, errors='coerce')
#df['D3_UNTIL'] = pd.to_datetime(df.D3_UNTIL, errors='coerce')
conds = [
df.RUNNING_DATE >= df.D3_UNTIL,
df.RUNNING_DATE >= df.D2_UNTIL,
df.RUNNING_DATE >= df.D1_UNTIL]
choices = [
df.DVAL3,
df.DVAL2,
df.DVAL1]
df['RESULT'] = np.select(conds, choices, default=None)
(I added extra rows at the end to illustrate logic)
RUNNING_DATE D1_UNTIL DVAL1 D2_UNTIL DVAL2 D3_UNTIL DVAL3 RESULT
0 2018-01-01 2018-01-01 10.0 NaT NaN NaT NaN 10
1 2018-01-02 NaT NaN NaT NaN NaT NaN None
2 2018-01-03 2018-01-01 NaN NaT NaN NaT NaN NaN
3 2018-01-04 2018-01-01 10.0 2018-01-03 15.0 NaT NaN 15
4 2018-01-05 2018-01-01 10.0 2018-01-03 20.0 2018-01-31 100.0 20
5 2018-01-06 2018-01-01 10.0 NaT 999.0 NaT NaN 10
6 2018-01-07 2018-01-01 10.0 2018-01-04 25.0 2018-01-06 300.0 300
7 NaT NaT NaN NaT NaN NaT NaN None
8 NaT NaT NaN 2018-01-01 24.0 NaT NaN None
Upvotes: 2