Reputation: 2342
I have a DataFrame in Python pandas like the one below:
col_date
-------
2001-01-05
1992-05-06
I want to use the function below to calculate age from the column above:
def age(born):
    """
    Desc.
    """
    born = datetime.strptime(born, '%y%m%d').date()
    date = "2021-08-01"
    return date.year - born.year - ((date.month, date.day) < (born.month, born.day))
When I check the values with df.col_date.unique(), I get:
array([datetime.date(2001, 1, 5), datetime.date(1992, 5, 6)], dtype=object)
When I apply my function with df["col_date"] = df["col_date"].apply(age)
I get the error: TypeError: strptime() argument 1 must be str, not datetime.date
When I convert the column to datetime first and then apply the function, I get a different error: TypeError: strptime() argument 1 must be str, not Timestamp
because the values are now Timestamps such as '2001-01-05T00:00:00.000000000'.
What can I do? I have no idea how to fix this.
Upvotes: 0
Views: 244
Reputation: 24314
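You can try pd.to_datetime():
import pandas as pd

def age(born):
    """
    Return the age in whole years on 2021-08-01.
    """
    born = pd.to_datetime(born, format='%Y-%m-%d')
    date = pd.to_datetime("2021-08-01")
    # The tuple comparison is True (counts as 1) when the birthday has not yet occurred in date's year
    return date.year - born.year - ((date.month, date.day) < (born.month, born.day))

# Finally:
df["col_date"] = df["col_date"].apply(age)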
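Alternatively, calculate the condition directly on the column:
df['col_date'] = pd.to_datetime(df['col_date'])
date = pd.to_datetime("2021-08-01")
# True where the birthday falls later in the year than date, so one year must be subtracted
not_yet = df['col_date'].dt.month.gt(date.month) | (df['col_date'].dt.month.eq(date.month) & df['col_date'].dt.day.gt(date.day))
df['col_date'] = date.year - df['col_date'].dt.year - not_yet.astype(int)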
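As a cross-check on the birthday condition, here is a minimal standard-library sketch. The name age_on and its ref parameter are hypothetical, introduced only for illustration. It assumes the values are either 'YYYY-MM-DD' strings, datetime.date objects, or datetime-like objects (pd.Timestamp is a subclass of datetime.datetime), so strptime is only called when the input really is a string.

```python
from datetime import date, datetime


def age_on(born, ref=date(2021, 8, 1)):
    """Return the age in whole years on the reference date (hypothetical helper)."""
    if isinstance(born, str):
        # Only strings need parsing; the format is assumed to be YYYY-MM-DD.
        born = datetime.strptime(born, "%Y-%m-%d").date()
    elif isinstance(born, datetime):
        # datetime (and pd.Timestamp) values are reduced to a plain date.
        born = born.date()
    # The tuple comparison is True (counts as 1) when the birthday
    # has not yet occurred in the reference year.
    return ref.year - born.year - ((ref.month, ref.day) < (born.month, born.day))


print(age_on(date(2001, 1, 5)))    # 20
print(age_on("1992-05-06"))        # 29
print(age_on(date(2000, 12, 25)))  # 20, birthday still to come in 2021
```

A function like this could be passed to df["col_date"].apply(...) whether the column holds strings, dates, or Timestamps.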
Upvotes: 1
Reputation: 16942
Use the dateutil module:
import datetime
from dateutil import relativedelta

def age(born: datetime.date):
    return relativedelta.relativedelta(datetime.date.today(), born)
For a datetime.date representing 2001-01-05, this returns something like the following (the exact values depend on today's date):
relativedelta(years=+20, months=+7, days=+7)
You can then read off whichever components you need:
>>> result = age(datetime.date(2001,1,5))
>>> result.years
20
>>> result.months
7
etc.
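To tie this back to the DataFrame in the question, here is a rough sketch of applying relativedelta to the whole column. The fixed reference date 2021-08-01 is taken from the question so the result is reproducible, and the new column name age is an assumption made for illustration.

```python
import datetime

import pandas as pd
from dateutil.relativedelta import relativedelta

# Sample data matching the question: the column already holds datetime.date objects.
df = pd.DataFrame(
    {"col_date": [datetime.date(2001, 1, 5), datetime.date(1992, 5, 6)]}
)

# Fixed reference date instead of date.today(), so the output does not change over time.
ref = datetime.date(2021, 8, 1)

# relativedelta(ref, born).years is the number of completed years between the two dates.
df["age"] = df["col_date"].apply(lambda born: relativedelta(ref, born).years)

print(df["age"].tolist())  # [20, 29]
```

Writing to a separate column keeps the original dates available, which the question's df["col_date"] = ... assignment would overwrite.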
Upvotes: 0