Reputation: 17894
My dataframe has two columns. When I subtract them to get the month in between, I got some weird numbers. Here is an example:
test = pd.DataFrame({'reg_date': [datetime(2017,3,1), datetime(2016,9,1)],
'leave_date':[datetime(2017,7,1), datetime(2017,6,1)]})
test['diff_month'] = test.leave_date.dt.month - test.reg_date.dt.month
test
The output:
If a user's register_date is last year, I get a negative number (also incorrect as well).
What operations should I perform to get the correct time difference in month between two datetime column?
Update: I changed the example a bit so it reflects more about the issue I am facing. Don't down vote so fast guys.
A hack I did to fix this is:
test['real_diff'] = test.diff_month.apply(lambda x: x if x > 0 else 12+x)
I don't like the hack so I am curious if there is any other way of doing it.
Upvotes: 1
Views: 3400
Reputation: 393933
IIUC you can call apply
and use relativedelta
as @zipa suggested:
In[29]:
from dateutil import relativedelta
test['real_diff'] = test.apply(lambda row: relativedelta.relativedelta(row['leave_date'], row['reg_date']).months, axis=1)
test
Out[29]:
leave_date reg_date real_diff
0 2017-07-01 2017-03-01 4
1 2017-06-01 2016-09-01 9
Upvotes: 2
Reputation: 27869
To get your result you can use relativedelta
from dateutil
:
import datetime
from dateutil import relativedelta
a = datetime.datetime(2016, 12, 1)
b = datetime.datetime(2017, 5, 1)
relativedelta.relativedelta(b, a).months
#5
Upvotes: 2