Anrich
Anrich

Reputation: 25

Converting days to years in Pandas DataFrame

I am trying to find the difference between 2 dates in a Pandas DataFrame this is my code:

raw['CALCULATED_AGE'] = ((raw.COMMENCEMENT_DATE - raw.DATE_OF_BIRTH))

this gives me the following output:

Pandas Output Column

I just want to convert the days to years, any easy way to do this ?

Thank you so much

Upvotes: 0

Views: 2857

Answers (1)

Niv Dudovitch
Niv Dudovitch

Reputation: 1658

You can use "relativedelta" and match it to your case:

from dateutil.relativedelta import relativedelta
rdelta = relativedelta(raw.COMMENCEMENT_DATE,raw.DATE_OF_BIRTH).years

Full code example:

create the data:

import pandas as pd
from dateutil.relativedelta import relativedelta
raw = pd.DataFrame({'COMMENCEMENT_DATE': ['3/10/2000', '3/11/2000', '3/12/2000'],
                   'DATE_OF_BIRTH': ['3/10/1990', '3/11/1991', '3/12/1990']})

raw['COMMENCEMENT_DATE'] = pd.to_datetime(raw['COMMENCEMENT_DATE'])
raw['DATE_OF_BIRTH'] = pd.to_datetime(raw['DATE_OF_BIRTH'])

Calc:

raw['CALCULATED_AGE'] = raw.apply(lambda x: relativedelta(x.COMMENCEMENT_DATE, x.DATE_OF_BIRTH).years, axis=1)

Output:

  COMMENCEMENT_DATE DATE_OF_BIRTH  CALCULATED_AGE
0        2000-03-10    1990-03-10              10
1        2000-03-11    1991-03-11               9
2        2000-03-12    1990-03-12              10

EDIT

Another solution works also for months:

raw['CALCULATED_AGE'] = (raw.COMMENCEMENT_DATE - raw.DATE_OF_BIRTH)/np.timedelta64(1, 'Y')
raw['CALCULATED_AGE'] = raw['CALCULATED_AGE'].astype(int)

If you want calc for months just change 'Y' to 'M'.

Upvotes: 2

Related Questions