Reputation: 25
I am trying to find the difference between 2 dates in a Pandas DataFrame this is my code:
raw['CALCULATED_AGE'] = ((raw.COMMENCEMENT_DATE - raw.DATE_OF_BIRTH))
this gives me the following output:
I just want to convert the days to years, any easy way to do this ?
Thank you so much
Upvotes: 0
Views: 2857
Reputation: 1658
You can use "relativedelta" and match it to your case:
from dateutil.relativedelta import relativedelta
rdelta = relativedelta(raw.COMMENCEMENT_DATE,raw.DATE_OF_BIRTH).years
Full code example:
create the data:
import pandas as pd
from dateutil.relativedelta import relativedelta
raw = pd.DataFrame({'COMMENCEMENT_DATE': ['3/10/2000', '3/11/2000', '3/12/2000'],
'DATE_OF_BIRTH': ['3/10/1990', '3/11/1991', '3/12/1990']})
raw['COMMENCEMENT_DATE'] = pd.to_datetime(raw['COMMENCEMENT_DATE'])
raw['DATE_OF_BIRTH'] = pd.to_datetime(raw['DATE_OF_BIRTH'])
Calc:
raw['CALCULATED_AGE'] = raw.apply(lambda x: relativedelta(x.COMMENCEMENT_DATE, x.DATE_OF_BIRTH).years, axis=1)
Output:
COMMENCEMENT_DATE DATE_OF_BIRTH CALCULATED_AGE
0 2000-03-10 1990-03-10 10
1 2000-03-11 1991-03-11 9
2 2000-03-12 1990-03-12 10
EDIT
Another solution works also for months:
raw['CALCULATED_AGE'] = (raw.COMMENCEMENT_DATE - raw.DATE_OF_BIRTH)/np.timedelta64(1, 'Y')
raw['CALCULATED_AGE'] = raw['CALCULATED_AGE'].astype(int)
If you want calc for months just change 'Y' to 'M'.
Upvotes: 2