Reputation: 1137
I'm looking for a way to calculate age from the following date/time format in Python.
e.g. 1956-07-01T00:00:00Z
I have written code that extracts the first four characters of the string, converts them to an int, and subtracts the result from 2017, but I would like to know whether there is a more efficient way to do it.
Upvotes: 5
Views: 13515
Reputation: 8803
If the data contains a year that nanosecond-resolution timestamps cannot represent (e.g. 1601), pd.to_datetime raises an OutOfBoundsDatetime error, as the traceback below shows.
import pandas as pd
(pd.to_datetime('today').year-pd.to_datetime('1601-07-01').year)
# Traceback (most recent call last):
# File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 444, in _convert_listlike
# values, tz = tslib.datetime_to_datetime64(arg)
# File "pandas/_libs/tslib.pyx", line 1810, in pandas._libs.tslib.datetime_to_datetime64 (pandas/_libs/tslib.c:33275)
# TypeError: Unrecognized value type: <class 'str'>
# During handling of the above exception, another exception occurred:
# Traceback (most recent call last):
# File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
# exec(code_obj, self.user_global_ns, self.user_ns)
# File "<ipython-input-45-829e219d9060>", line 1, in <module>
# (pd.to_datetime('today').year-pd.to_datetime('1601-07-01').year)
# File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 518, in to_datetime
# result = _convert_listlike(np.array([arg]), box, format)[0]
# File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 447, in _convert_listlike
# raise e
# File "/home/kuroyanagi/.pyenv/versions/anaconda3-4.4.0/lib/python3.6/site-packages/pandas/core/tools/datetimes.py", line 435, in _convert_listlike
# require_iso8601=require_iso8601
# File "pandas/_libs/tslib.pyx", line 2355, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:46617)
# File "pandas/_libs/tslib.pyx", line 2538, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:45511)
# File "pandas/_libs/tslib.pyx", line 2506, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:44978)
# File "pandas/_libs/tslib.pyx", line 2500, in pandas._libs.tslib.array_to_datetime (pandas/_libs/tslib.c:44859)
# File "pandas/_libs/tslib.pyx", line 1517, in pandas._libs.tslib.convert_to_tsobject (pandas/_libs/tslib.c:28598)
# File "pandas/_libs/tslib.pyx", line 1774, in pandas._libs.tslib._check_dts_bounds (pandas/_libs/tslib.c:32752)
# pandas._libs.tslib.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1601-07-01 00:00:00
For data that includes such out-of-range years, you can compute the elapsed years directly from the year and month fields of the string:
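import numpy as np
import pandas as pd
date = pd.Series(['1601-07-01', '1956-07-01'])
def elapsed_years(date):
    # Reference point: the current year and month
    today = pd.to_datetime('today')
    reference_year = today.year
    reference_month = today.month
    # Slice the year and month out of the string instead of parsing a timestamp.
    # The builtin float replaces np.float, which was removed from NumPy.
    year = date.str.slice(0, 4).astype(float)
    month = date.str.slice(5, 7).astype(float)
    # Total months elapsed, floored to whole years
    duration = np.floor((12 * (reference_year - year) + (reference_month - month)) / 12)
    return duration
elapsed_years(date)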
# Out[46]:
# 0 416.0
# 1 61.0
# dtype: float64
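As a rough sketch of the same month-resolution calculation without pandas: the standard-library datetime type covers years 1 through 9999, so 1601 parses without an out-of-bounds error. The function name `elapsed_years_stdlib` and the fixed reference date are assumptions made for illustration, not part of the answer above.

```python
from datetime import date, datetime

def elapsed_years_stdlib(timestamps, reference):
    """Whole years between each timestamp and `reference`, at month resolution."""
    result = []
    for text in timestamps:
        # datetime supports years 1 through 9999, so 1601 is accepted here.
        parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
        # Same arithmetic as the pandas version: total months, floored to years.
        months = 12 * (reference.year - parsed.year) + (reference.month - parsed.month)
        result.append(months // 12)
    return result

# Assumed fixed "today" so the result does not change from run to run.
reference = date(2017, 9, 30)
print(elapsed_years_stdlib(["1601-07-01T00:00:00Z", "1956-07-01T00:00:00Z"], reference))
```

Passing the reference date in as an argument, rather than reading the clock inside the function, keeps the calculation reproducible.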
Upvotes: 0
Reputation: 294218
I'd take the number of days from the timedelta object and divide it by 365.25:
(pd.to_datetime('today') - pd.to_datetime('1956-07-01')).days / 365.25
61.24845995893224
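A small sketch of the same days / 365.25 arithmetic using standard-library dates, with a fixed reference date (an assumption, so the numbers are reproducible). The helper name `years_by_days` is hypothetical. Because 365.25 is only an average year length, the fractional result can land just under a whole number on an exact anniversary, which matters if you truncate it to an integer age.

```python
from datetime import date

def years_by_days(born, today):
    """Approximate age in years: elapsed days divided by the average year length."""
    return (today - born).days / 365.25

# Assumed fixed reference date.
print(years_by_days(date(1956, 7, 1), date(2017, 9, 30)))

# Exactly one calendar year apart, but only 365 days, so the result is below 1.
first_birthday = years_by_days(date(2000, 3, 1), date(2001, 3, 1))
print(first_birthday, int(first_birthday))
```

If a fractional age is what you need, the approximation is usually fine; for whole completed years, comparing month and day directly avoids the edge case.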
Upvotes: 3
Reputation: 323226
Is this what you want?
(pd.to_datetime('today').year-pd.to_datetime('1956-07-01').year)
Out[83]: 61
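One thing to keep in mind: subtracting the years alone counts the current year even when the birthday has not happened yet. A minimal sketch of the difference, using standard-library dates; the function names and the reference dates are assumptions chosen for illustration.

```python
from datetime import date

def year_difference(born, today):
    """Plain year subtraction, as in the one-liner above."""
    return today.year - born.year

def completed_years(born, today):
    """Year subtraction, minus one if this year's birthday is still ahead."""
    had_birthday = (today.month, today.day) >= (born.month, born.day)
    return today.year - born.year - (0 if had_birthday else 1)

born = date(1956, 7, 1)
for today in (date(2017, 3, 15), date(2017, 9, 30)):  # before and after 1 July
    print(today, year_difference(born, today), completed_years(born, today))
```

The two functions agree once the birthday has passed in the reference year and differ by one before it.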
Upvotes: 10