Nivi

Reputation: 1137

Calculating age from date/time format in python/pandas

I'm looking for a way to calculate age from the following date/time format in Python.

e.g. 1956-07-01T00:00:00Z

I have written code that does this by extracting the first four characters of the string, converting them to an int, and subtracting the result from 2017, but I was wondering whether there is a more efficient way to do it.

Upvotes: 5

Views: 13515

Answers (3)

Keiku

Reputation: 8803

If the data contains an irregular year (e.g. 1601), as below, pd.to_datetime raises an OutOfBoundsDatetime error.

import pandas as pd

(pd.to_datetime('today').year-pd.to_datetime('1601-07-01').year)

# Traceback (most recent call last):
#   File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 444, in _convert_listlike
#     values, tz = tslib.datetime_to_datetime64(arg)
#   File "pandas/_libs/tslib.pyx", line 1810, in pandas._libs.tslib.datetime_to_datetime64 (pandas/_libs/tslib.c:33275)
# TypeError: Unrecognized value type: <class 'str'>
# During handling of the above exception, another exception occurred:
# Traceback (most recent call last):
#   File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
#     exec(code_obj, self.user_global_ns, self.user_ns)
#   File "<ipython-input-45-829e219d9060>", line 1, in <module>
#     (pd.to_datetime('today').year-pd.to_datetime('1601-07-01').year)
#   File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 518, in to_datetime
#     result = _convert_listlike(np.array([arg]), box, format)[0]
#   File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 447, in _convert_listlike
#     raise e
#   File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 435, in _convert_listlike
#     require_iso8601=require_iso8601
#   File "pandas/_libs/tslib.pyx", line 2355, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:46617)
#   File "pandas/_libs/tslib.pyx", line 2538, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:45511)
#   File "pandas/_libs/tslib.pyx", line 2506, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:44978)
#   File "pandas/_libs/tslib.pyx", line 2500, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:44859)
#   File "pandas/_libs/tslib.pyx", line 1517, in pandas._libs.tslib.convert_to_tsobject (pandas/_libs/tslib.c:28598)
#   File "pandas/_libs/tslib.pyx", line 1774, in pandas._libs.tslib._check_dts_bounds (pandas/_libs/tslib.c:32752)
# pandas._libs.tslib.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1601-07-01 00:00:00
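
The error comes from the default nanosecond resolution of pandas timestamps, which only covers roughly the years 1677 through 2262. A quick way to check those bounds is sketched below; this assumes the default nanosecond unit, and newer pandas versions may also offer coarser resolutions with a wider range.

```python
import pandas as pd

# Bounds of the default nanosecond-resolution Timestamp.
print(pd.Timestamp.min)
print(pd.Timestamp.max)

# 1601 falls before the lower bound, which is why parsing it fails.
earliest_year = pd.Timestamp.min.year
latest_year = pd.Timestamp.max.year
```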

For data that includes irregular years, you can calculate the elapsed years as follows.

import numpy as np
import pandas as pd

date = pd.Series(['1601-07-01', '1956-07-01'])

def elapsed_years(date):
    # Take the reference year and month from a single "today" timestamp.
    today = pd.to_datetime('today')
    # Parse year and month by position; np.float was removed in NumPy 1.24, so use float.
    year = date.str.slice(0, 4).astype(float)
    month = date.str.slice(5, 7).astype(float)
    # Whole years elapsed, counted in months (the day of the month is ignored).
    duration = np.floor((12 * (today.year - year) + (today.month - month)) / 12)
    return duration

elapsed_years(date)
# Out[46]: 
# 0    416.0
# 1     61.0
# dtype: float64
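
As an alternative sketch, the standard library's datetime module accepts years from 1 to 9999, so it can parse the format from the question without hitting the pandas bounds. The helper name `age_on` and the fixed reference date are assumptions made for illustration, not part of the approach above.

```python
from datetime import date, datetime

def age_on(timestamp, reference):
    """Whole years between an ISO-style 'Z' timestamp and a reference date."""
    born = datetime.strptime(timestamp, '%Y-%m-%dT%H:%M:%SZ').date()
    # True (counts as 1) if this year's birthday has not happened yet.
    before_birthday = (reference.month, reference.day) < (born.month, born.day)
    return reference.year - born.year - before_birthday

# Hypothetical fixed reference date so the result is reproducible.
reference = date(2017, 10, 1)
ages = [age_on(s, reference) for s in ['1601-07-01T00:00:00Z', '1956-07-01T00:00:00Z']]
print(ages)
```

Unlike the month-based calculation, this version also takes the day of the month into account.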

Upvotes: 0

piRSquared

Reputation: 294218

I'd take the number of days from the Timedelta object and divide it by 365.25:

(pd.to_datetime('today') - pd.to_datetime('1956-07-01')).days / 365.25

61.24845995893224
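
The same idea can be applied to a whole column. The following is a minimal sketch, assuming the strings carry the trailing Z from the question (so they are parsed as UTC) and using a fixed, hypothetical reference date instead of 'today' to keep the result reproducible.

```python
import pandas as pd

# Sample data in the question's format; the second value is made up.
births = pd.Series(['1956-07-01T00:00:00Z', '1990-01-15T00:00:00Z'])

# Parse as timezone-aware UTC timestamps.
born = pd.to_datetime(births, utc=True)

# The reference must be timezone-aware too, or the subtraction fails.
reference = pd.Timestamp('2017-10-01', tz='UTC')

# Elapsed days per row, then approximate years.
elapsed_days = (reference - born).dt.days
fractional_age = elapsed_days / 365.25
print(fractional_age)
```

Dividing by 365.25 is an approximation, so the result can be slightly off close to a birthday.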

Upvotes: 3

BENY

Reputation: 323226

Is this what you want?

(pd.to_datetime('today').year-pd.to_datetime('1956-07-01').year)

Out[83]: 61
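
Note that subtracting the years alone counts the current year even when the birthday has not occurred yet. If that matters, a possible vectorized variant is sketched below; the function name `age_in_years` and the fixed reference date are assumptions for illustration.

```python
import pandas as pd

def age_in_years(birth_dates, reference):
    """Whole years elapsed for each birth date as of the reference date."""
    born = pd.to_datetime(birth_dates, utc=True)
    ref = pd.Timestamp(reference)
    years = ref.year - born.dt.year
    # Rows where this year's birthday falls after the reference date.
    not_yet = (born.dt.month > ref.month) | (
        (born.dt.month == ref.month) & (born.dt.day > ref.day)
    )
    return years - not_yet.astype(int)

births = pd.Series(['1956-07-01T00:00:00Z', '1956-12-25T00:00:00Z'])
ages = age_in_years(births, '2017-10-01')
print(ages)
```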

Upvotes: 10
