kms
kms

Reputation: 2014

Months between two dates - pandas series and datetime.date.today()

I am trying to calculate the number of months between two dates. I am running the operation on a pandas Series.

Sample series:

3645   2014-06-24
3646   2020-11-03
3647   2016-06-28
3648   2017-07-20
3649   2000-03-27
Name: lastSaleDate, Length: 1797, dtype: datetime64[ns]

I'd like to calculate the number of months since this date, i.e. the difference between date in the column and today.

import pandas as pd
import datetime as dt
import numpy as np

# Calc. months since last txn
df['mos'] = ((df.Date - dt.date.today())/np.timedelta64(1, 'M'))

Traceback:

TypeError                                 Traceback (most recent call last)
/var/folders/d0/gnksqzwn2fn46fjgrkp6045c0000gn/T/ipykernel_92607/847000123.py in <module>
      5 
      6 # Calc. months 
----> 7 df['mos'] = ((df.Date - dt.date.today())/np.timedelta64(1, 'M'))

/Applications/Anaconda/anaconda3/lib/python3.9/site-packages/pandas/core/ops/common.py in new_method(self, other)
     67         other = item_from_zerodim(other)
     68 
---> 69         return method(self, other)
     70 
     71     return new_method

/Applications/Anaconda/anaconda3/lib/python3.9/site-packages/pandas/core/arraylike.py in __sub__(self, other)
     98     @unpack_zerodim_and_defer("__sub__")
     99     def __sub__(self, other):
--> 100         return self._arith_method(other, operator.sub)
    101 
    102     @unpack_zerodim_and_defer("__rsub__")

/Applications/Anaconda/anaconda3/lib/python3.9/site-packages/pandas/core/series.py in _arith_method(self, other, op)
   5524 
   5525         with np.errstate(all="ignore"):
-> 5526             result = ops.arithmetic_op(lvalues, rvalues, op)
   5527 
   5528         return self._construct_result(result, name=res_name)

/Applications/Anaconda/anaconda3/lib/python3.9/site-packages/pandas/core/ops/array_ops.py in arithmetic_op(left, right, op)
    216         # Timedelta/Timestamp and other custom scalars are included in the check
    217         # because numexpr will fail on it, see GH#31457
--> 218         res_values = op(left, right)
    219     else:
    220         # TODO we should handle EAs consistently and move this check before the if/else

TypeError: unsupported operand type(s) for -: 'DatetimeArray' and 'datetime.date'

Upvotes: 1

Views: 1572

Answers (1)

user7864386
user7864386

Reputation:

As the error says, the types don't match. You have to either convert datetime.date object to pandas datetime or you could use to_datetime('today') instead:

df['mos'] = df['lastSaleDate'].rsub(pd.to_datetime('today'))/np.timedelta64(1, 'M')

or (more readably)

df['mos'] = (pd.to_datetime('today') - df['lastSaleDate'])/np.timedelta64(1, 'M')

Output:

3645     92.710100
3646     16.355349
3647     68.561760
3648     55.846920
3649    263.621207
Name: dates, dtype: float64

Upvotes: 1

Related Questions