kirankumar M
kirankumar M

Reputation: 59

Difference Between two dates columns

I have to find difference between two date columns in python dataframe and to compare whether difference is greater than 120 or not

if working_data['CLAIMS_EVENT_DATE'] - working_data['LAST_LAPSED_DATE'] > 120:

I got below Error

invalid_comparison .format(dtype=left.dtype, typ=type(right).name))

TypeError: Invalid comparison between dtype=timedelta64[ns] and int

Upvotes: 0

Views: 79

Answers (2)

Priya
Priya

Reputation: 66

#Convert both columns to datetime format
working_data['CLAIMS_EVENT_DATE'] = pd.to_datetime(working_data['CLAIMS_EVENT_DATE'])
working_data['LAST_LAPSED_DATE'] = pd.to_datetime(working_data['LAST_LAPSED_DATE'])

#Calculate the difference between the days
working_data['Days'] = (working_data['LAST_LAPSED_DATE']                           
                        - working_data['CLAIMS_EVENT_DATE']).days

#Create a column 'Greater' and check whether difference is greater than 120 or not
working_data.loc[working_data.Days <= 120, 'Greater'] = 'False' 
working_data.loc[working_data.Days > 120, 'Greater'] = 'True' 

Upvotes: 0

jezrael
jezrael

Reputation: 863741

If compare both get timedeltas, so for compare are possible 2 solutions - compare days by Series.dt.days with Series.any if need test if at least one value match condition:

s = (working_data['CLAIMS_EVENT_DATE'] - working_data['LAST_LAPSED_DATE'])

if (s.dt.days > 120).any():
    print ('At least one value is higher')

Or compare by Timedelta:

if (s > pd.Timedelta(120, unit='d')).any():
    print ('At least one value is higher')

If need fitler rows use boolean indexing:

df = working_data[s.dt.days > 120]

Or:

df = working_data[s > pd.Timedelta(120, unit='d')]

Upvotes: 1

Related Questions