Mike Mann

Reputation: 546

Lambda Apply to find difference between two dates

I'm trying to use the apply method with a lambda to find the number of months between two dates. I'm currently getting an AttributeError:

AttributeError: 'datetime.date' object has no attribute 'dt'

My upfront conversion:

df['date1'] = pd.to_datetime(df['date1'], errors='ignore', infer_datetime_format=True)
df['date2'] = pd.to_datetime(df['date2'], errors='ignore', infer_datetime_format=True)

Here is my block:

df['Duration (Months)'] = df.apply(lambda x: x["Date1"].dt.to_period('M').astype(int) - x["Date2"].dt.to_period('M').astype(int), axis=1)

Second attempt:

df['Duration (Months)'] = df['date1'].dt.to_period('M').astype(int) - df['date2'].dt.to_period('M').astype(int)

Any thoughts on where I'm going wrong?

Upvotes: 0

Views: 864

Answers (1)

Timeless

Reputation: 37747

From the documentation :

Series has an accessor to succinctly return datetime like properties for the values of the Series, if it is a datetime/period like Series. This will return a Series, indexed like the existing Series.

So there is no need to use the .dt accessor inside pandas.DataFrame.apply with axis=1, because the lambda receives one row at a time and x["date1"] is then a single element (already a datetime object), not a Series. Hence the errors below (depending on the type of your column):

AttributeError: 'datetime.date' object has no attribute 'dt'
AttributeError: 'Timestamp' object has no attribute 'dt'
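To make the distinction concrete, here is a minimal sketch (the Series `s` and its values are made up for illustration) showing that the .dt accessor lives on the Series, while a single element only has the scalar methods such as to_period:

```python
import pandas as pd

# Hypothetical sample data, only for illustration
s = pd.Series(pd.to_datetime(["2022-01-13", "2022-02-05"]))

# The accessor exists on a datetime-like Series...
series_has_dt = hasattr(s, "dt")

# ...but not on one element pulled out of it (a Timestamp)
element = s.iloc[0]
element_has_dt = hasattr(element, "dt")

# The scalar exposes to_period directly, without .dt
period = element.to_period("M")

print(series_has_dt, element_has_dt, period)
```

This is the situation inside the lambda: each x["date1"] is a scalar like `element`, so the method has to be called on it directly.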

Try this instead :

(df.apply(lambda x: x["date1"].to_period('M') - x["date2"].to_period('M'), axis=1))

Or with vectorized code:

(df["date1"].dt.to_period('M') - df["date2"].dt.to_period("M")) #here, we needed the .dt accessor

0    <0 * MonthEnds>
1    <-1 * MonthEnd>
2    <6 * MonthEnds>
dtype: object

The subtraction returns pandas.tseries.offsets.DateOffset objects (here MonthEnd). To convert them to integers, you can use operator.attrgetter to read each offset's n attribute:

from operator import attrgetter

(df["date1"].dt.to_period('M') - df["date2"].dt.to_period("M")).apply(attrgetter("n"))

0    0
1   -1
2    6
dtype: int64

Used input :

       date1      date2
0 2022-01-13 2022-01-01
1 2022-02-05 2022-03-06
2 2022-10-14 2022-04-09

date1    datetime64[ns]
date2    datetime64[ns]
dtype: object
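As a possible alternative to going through periods and offsets, the month difference can also be sketched with plain integer arithmetic on the year and month components. This is not the approach used above, just another way to get the same numbers; the helper name `month_index` is made up for this example, and the frame is rebuilt from the input shown:

```python
import pandas as pd

# Rebuild the sample input shown above
df = pd.DataFrame({
    "date1": pd.to_datetime(["2022-01-13", "2022-02-05", "2022-10-14"]),
    "date2": pd.to_datetime(["2022-01-01", "2022-03-06", "2022-04-09"]),
})

def month_index(col):
    # Hypothetical helper: count months since year 0 for a datetime Series
    return col.dt.year * 12 + col.dt.month

# Difference in calendar months, ignoring the day of the month
months = month_index(df["date1"]) - month_index(df["date2"])
print(months.tolist())
```

Like the period-based version, this counts calendar-month boundaries crossed and ignores the day of the month, so 2022-02-05 to 2022-03-06 gives -1.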

Upvotes: 1
