Chris
Chris

Reputation: 515

Python Pandas compare date columns, check if not empty, conditional > <= logic, return value

using Python3 Pandas, I'm trying to calculate RESULT. I keep getting Boolean ambiguous value errors. Do I need to test that each date column I compare with is not null first to avoid an error? The end result should mimic:

#check if D3_UNTIL is not empty
if df.RUNNING_DATE.isna()==False:
    if df.D3_UNTIL.isna()==False:
        if.RUNNING_DATE >= df.D3_UNTIL:
            df.RESULT = df.DVAL3
        elif (df.RUNNING_DATE >= df.D2_UNTIL & df.RUNNING_DATE < df.D3_UNTIL):
            df.RESULT = df.DVAL2    
        elif (df.RUNNING_DATE >= df.D1_UNTIL & df.RUNNING_DATE < df.D2_UNTIL):
            df.RESULT = df.DVAL1
        else None
#check if D2.UNTIL is not empty
    elif df.D2_UNTIL.isna()==False:
        if.RUNNING_DATE >= df.D2_UNTIL:
            df.RESULT = df.DVAL2
        elif (df.RUNNING_DATE >= df.D1_UNTIL & df.RUNNING_DATE < df.D2_UNTIL):
            df.RESULT = df.DVAL1    
        else None
#check if D1.UNTIL is not empty    
    elif df.D1_UNTIL.isna()==False:
        if.RUNNING_DATE >= df.D1_UNTIL:
            df.RESULT = df.DVAL1
        else None
else None



RUNNING_DATE  D1_UNTIL  DVAL1  D2_UNTIL  DVAL2  D3_UNTIL  DVAL3  RESULT
1/1/2018      1/1/2018  10                                       10             
1/2/2018                                
1/3/2018      1/1/2018                          
1/4/2018      1/1/2018  10     1/3/2018  15             
1/5/2018      1/1/2018  10     1/3/2018  20     1/31/2018 100    20 
1/6/2018      1/1/2018  10               999                
1/7/2018      1/1/2018  10     1/4/2018  25     1/6/2018  300    300    

Upvotes: 0

Views: 2866

Answers (1)

ALollz
ALollz

Reputation: 59519

With if-else statements you can use np.select to implement your logic. Also checking df.RUNNING_DATE.isna()==False is superfluous; just use df.RUNNING_DATE.notnull().

Further, the logic here can be simplified immensely.

  • Any >=, ==, or <= date comparison with NaT will return False, so first checking if the value is finite not needed when already checking if RUNNING_DATE is greater. Also any comparison of NaT with NaT returns False, giving us a default check if RUNNING_DATE is null.
  • Since your date checks span the entire range of possibilities, just check >= sequentially.

Code

import pandas as pd
import numpy a np

# Ensure Datetime
#df['RUNNING_DATE'] = pd.to_datetime(df.RUNNING_DATE, errors='coerce')
#df['D1_UNTIL'] = pd.to_datetime(df.D1_UNTIL, errors='coerce')
#df['D2_UNTIL'] = pd.to_datetime(df.D2_UNTIL, errors='coerce')
#df['D3_UNTIL'] = pd.to_datetime(df.D3_UNTIL, errors='coerce')

conds = [
    df.RUNNING_DATE >= df.D3_UNTIL,
    df.RUNNING_DATE >= df.D2_UNTIL,
    df.RUNNING_DATE >= df.D1_UNTIL]

choices = [
    df.DVAL3,
    df.DVAL2,
    df.DVAL1]

df['RESULT'] = np.select(conds, choices, default=None)

Output:

(I added extra rows at the end to illustrate logic)

  RUNNING_DATE   D1_UNTIL  DVAL1   D2_UNTIL  DVAL2   D3_UNTIL  DVAL3 RESULT
0   2018-01-01 2018-01-01   10.0        NaT    NaN        NaT    NaN     10
1   2018-01-02        NaT    NaN        NaT    NaN        NaT    NaN   None
2   2018-01-03 2018-01-01    NaN        NaT    NaN        NaT    NaN    NaN
3   2018-01-04 2018-01-01   10.0 2018-01-03   15.0        NaT    NaN     15
4   2018-01-05 2018-01-01   10.0 2018-01-03   20.0 2018-01-31  100.0     20
5   2018-01-06 2018-01-01   10.0        NaT  999.0        NaT    NaN     10
6   2018-01-07 2018-01-01   10.0 2018-01-04   25.0 2018-01-06  300.0    300
7          NaT        NaT    NaN        NaT    NaN        NaT    NaN   None
8          NaT        NaT    NaN 2018-01-01   24.0        NaT    NaN   None

Upvotes: 2

Related Questions