user

Reputation: 771

How to determine number of months between two dates in Python?

I have two columns that are datetime64[ns] objects. I am trying to determine the number of months between them.

The columns are:

city_clean['last_trip_date']
city_clean['signup_date']

Format is YYYY-MM-DD

I tried

from dateutil.relativedelta import relativedelta

city_clean['months_active'] = relativedelta(city_clean['signup_date'], city_clean['last_trip_date'])

And get the following error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Does anyone know what could cause this issue? I feel like this is the most accurate way to calculate the number of months.

Upvotes: 2

Views: 13375

Answers (3)

Gergely M

Reputation: 743

The first thing that comes to my mind...

>>> from datetime import datetime, timedelta

>>> dt1 = datetime(year=2020, month=3, day=1)
>>> dt2 = datetime(year=2020, month=5, day=1)
>>> # delta = dt2-dt1
>>> delta = abs(dt2-dt1)

>>> delta
datetime.timedelta(days=61)

>>> delta.days
61
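If an approximate month count is enough, the day count above can be divided by an average month length. This is only a rough sketch: the constant AVG_DAYS_PER_MONTH is my own assumption (365.25 / 12), not something the standard library defines.

```python
from datetime import datetime

# Assumption: an "average" month is 365.25 / 12 days (about 30.44)
AVG_DAYS_PER_MONTH = 365.25 / 12

dt1 = datetime(2020, 3, 1)
dt2 = datetime(2020, 5, 1)

# abs() makes the result independent of the order of the two dates
delta = abs(dt2 - dt1)

# Fractional, approximate number of months
approx_months = delta.days / AVG_DAYS_PER_MONTH
print(round(approx_months, 2))
```

The result is fractional and will rarely land exactly on a whole number, so round or floor it depending on what "months" should mean for you.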

UPDATE: The point I wanted to make is to take the absolute value of the delta with abs(), so the order of the two dates does not matter.

In Python 3.10 this also works with dateutil.relativedelta:



from datetime import datetime
from dateutil.relativedelta import relativedelta


city_clean_dates = [
    {'signup_date': '2019-12-01', 'last_trip_date': '2020-02-01'},
    {'signup_date': '2021-01-01', 'last_trip_date': '2020-05-01'},
    {'signup_date': '2020-03-01', 'last_trip_date': '2020-05-31'},
]

for city_clean in city_clean_dates:
    city_clean['last_trip_date'] = datetime.strptime(city_clean['last_trip_date'], '%Y-%m-%d')
    city_clean['signup_date'] = datetime.strptime(city_clean['signup_date'], '%Y-%m-%d')

    rd1 = abs(relativedelta(city_clean['last_trip_date'], city_clean['signup_date']))
    rd2 = abs(relativedelta(city_clean['signup_date'], city_clean['last_trip_date']))

    assert rd1 == rd2

    print(f"Recent - old date: {rd1}")
    print(f"Old - recent date: {rd2}")

This would print:

Recent - old date: relativedelta(months=+2)
Old - recent date: relativedelta(months=+2)
Recent - old date: relativedelta(months=+8)
Old - recent date: relativedelta(months=+8)
Recent - old date: relativedelta(months=+2, days=+30)
Old - recent date: relativedelta(months=+2, days=+30)

Note that neither of my solutions returns months directly: the first returns days only, and the second returns whole months plus the extra days of the partial month. The ambiguity is obvious in the case of {'signup_date': '2020-03-01', 'last_trip_date': '2020-05-31'}.

Normally we would call that 3 months, but in reality it is one day short. It's up to the developer to resolve this ambiguity according to the use case.
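One possible way to settle the ambiguity is to count only completed months. Here is a minimal sketch using just the standard library; whole_months_between is a hypothetical helper name, and the rule "a month is complete once the starting day-of-month is reached" is an assumption you may want to change.

```python
from datetime import date


def whole_months_between(d1, d2):
    """Count completed calendar months between two dates, in either order."""
    start, end = sorted((d1, d2))
    months = (end.year - start.year) * 12 + (end.month - start.month)
    # The last month only counts once the starting day-of-month is reached
    if end.day < start.day:
        months -= 1
    return months


print(whole_months_between(date(2019, 12, 1), date(2020, 2, 1)))   # 2
print(whole_months_between(date(2021, 1, 1), date(2020, 5, 1)))    # 8
print(whole_months_between(date(2020, 3, 1), date(2020, 5, 31)))   # 2
```

Be aware that this can differ from relativedelta around month ends: from 2020-01-31 to 2020-02-29 this helper returns 0, while relativedelta clips to the last valid day of February and reports one month.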

Upvotes: 2

thorfi

Reputation: 327

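relativedelta works on single datetime values, not on whole Series, which is what triggers the ValueError. Apply it row by row, then extract the properties you want from the relativedelta, in this case .years and .months:

from dateutil.relativedelta import relativedelta

def months_between(row):
    # later date first, so the result is positive
    rel = relativedelta(row['last_trip_date'], row['signup_date'])
    return rel.years * 12 + rel.months

city_clean['months_active'] = city_clean.apply(months_between, axis=1)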

Upvotes: 3

wp78de

Reputation: 19000

This is Pandas, right? Try it like this:

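import numpy as np

# calculate the difference between the two dates (a timedelta64[ns] Series)
df['diff_months'] = df['End_date'] - df['Start_date']
# convert to fractional months using an average month of 365.25 / 12 days;
# np.timedelta64(1, 'M') is not a fixed duration and newer pandas versions reject it
df['diff_months'] = df['diff_months'] / np.timedelta64(1, 'D') / (365.25 / 12)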

Or, if you have proper datetime objects,

def diff_month(d1, d2):
    return (d1.year - d2.year) * 12 + d1.month - d2.month
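The same year/month arithmetic can probably be applied to whole columns through the .dt accessor, which avoids a Python-level loop. This is a sketch on made-up sample data; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data using the column names from the question
city_clean = pd.DataFrame({
    'signup_date': pd.to_datetime(['2019-12-01', '2020-03-01', '2020-03-31']),
    'last_trip_date': pd.to_datetime(['2020-02-01', '2020-05-31', '2020-04-01']),
})

start = city_clean['signup_date']
end = city_clean['last_trip_date']

# Same arithmetic as diff_month, applied to whole columns via the .dt accessor
city_clean['months_active'] = (
    (end.dt.year - start.dt.year) * 12 + (end.dt.month - start.dt.month)
)

print(city_clean['months_active'].tolist())  # [2, 2, 1]
```

Note that this ignores the day of the month: the last row spans a single day (2020-03-31 to 2020-04-01) but still counts as 1 because the calendar month changed.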

Upvotes: 5
