Stacknewb

Reputation: 13

Date difference of two columns on Pandas

The fields 'now', 'EndDate' and 'CreateDate' are dates. Using the later of EndDate and CreateDate, 'diff_m' should hold the number of months between that date and 'now', rounded down.

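# imports assumed from the aliases used below
from datetime import datetime as dt
import numpy as np

now = dt.now()
df['MaxDate'] = df[['EndDate', 'CreateDate']].max(axis=1)
df['diff_months'] = now - df['MaxDate']
df['diff_months'] = df['diff_months'] / np.timedelta64(1, 'M')
df['diff_m'] = df['diff_months'].apply(np.floor)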

However, I'm getting the following error:

TypeError: unsupported operand type(s) for -: 'datetime.datetime' and 'float'

Both CreateDate and EndDate have dtype datetime64[ns].

Upvotes: 0

Views: 42

Answers (1)

Anurag Reddy

Reputation: 1215

The problem lies in this line:

df['diff_months'] = now - df['MaxDate']

When you take the row-wise max, the dtype of the new column can change, so 'MaxDate' is no longer datetime64[ns] and the subtraction fails. You can fix that by converting the column back with pd.to_datetime:

df['diff_months'] = now - pd.to_datetime(df['MaxDate'])

I tried a small example, here is the result:

In [36]: new_df['temp_date'] = new_df[['date','new_date']].max(axis=1)

In [37]: now = datetime.now()

In [38]: new_df['diff_mon'] = now-new_df['temp_date']

TypeError: unsupported operand type(s) for -: 'datetime.datetime' and 'str'

However, after converting the column to datetime, the subtraction works:

In [39]: new_df['diff_mon'] = now-pd.to_datetime(new_df['temp_date'])

In [40]: new_df['diff_mon']
Out[40]: 
0    7487 days 19:20:37.060114
1    7547 days 19:20:37.060114
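
For the rounded-down month count itself, here is a minimal sketch under some assumptions: the sample rows and the fixed `now` value are made up for illustration, and the month count uses calendar arithmetic (year and month fields plus a day-of-month check) rather than the division by `np.timedelta64(1, 'M')` from the question, which I believe recent pandas versions reject. Column names follow the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "EndDate": pd.to_datetime(["2020-01-15", "2019-06-30"]),
    "CreateDate": pd.to_datetime(["2019-12-01", "2019-08-20"]),
})

# Fixed timestamp so the result is reproducible (an assumption for this demo);
# pd.Timestamp.now() would be the live equivalent.
now = pd.Timestamp("2020-03-10")

# Row-wise max, then force datetime dtype in case max() changed it.
df["MaxDate"] = pd.to_datetime(df[["EndDate", "CreateDate"]].max(axis=1))

# Calendar months between MaxDate and now, ignoring the day of month.
months = (now.year - df["MaxDate"].dt.year) * 12 + (now.month - df["MaxDate"].dt.month)

# Round down: subtract one month if the day of month has not been reached yet.
df["diff_m"] = months - (now.day < df["MaxDate"].dt.day).astype(int)

print(df[["MaxDate", "diff_m"]])
```

This counts whole calendar months, so it can differ slightly from dividing a day count by an average month length; which one is right depends on what "month difference" should mean for your data.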

Upvotes: 1
