Reputation: 195
I had a dataframe like below:
startdate terminationdate
0 1997-07-13 2004-09-29
1 1999-07-26 2016-03-23
2 2003-04-01 NaT
3 2007-06-01 NaT
4 2009-06-01 NaT
I would like to get the output to calculate the tenure in months. For null value in terminationdate, I would like to use current date to calculate.
I tried the code below:
def tenure(df):
if df['terminationdate'] != np.nan:
tenure = (df['terminationdate'] - df['startdate'])/np.timedelta64(1, 'M')
else:
tenure = (datetime.datetime.now() - df['startdate'])/np.timedelta64(1, 'M')
return tenure
The tenure of NaT value could not be calculated with above code.
Upvotes: 1
Views: 2519
Reputation: 148880
You can just use fillna
to replace the NaT
in terminationdate
with current one:
tenure = (df['terminationdate'].fillna(datetime.datetime.today()) - df['startdate'])/
np.timedelta64(1, 'M')
Upvotes: 2
Reputation: 338
tawab_shakeel's answer is valid. In case you want to keep the null in the dataset and follow your logic, you can use the apply function to each raw
from datetime import datetime
import pandas as pd
import numpy as np
df = pd.DataFrame({'startdate': pd.to_datetime(['1997-07-13', '1999-07-26', '2003-04-01', '2007-06-01', '2009-06-01']),
'terminationdate': pd.to_datetime(['2004-09-29', '2016-03-23', None, None, None])})
def tenure(start_date, end_date):
if pd.isna(end_date) == False:
tenure = (end_date - start_date)/np.timedelta64(1, 'M')
else:
tenure = (datetime.now() - start_date)/np.timedelta64(1, 'M')
return tenure
tenure_month = df.apply(lambda row: tenure(row['startdate'], row['terminationdate']), axis = 1)
Upvotes: 2
Reputation: 3739
Try converting columns into date using pd.to_datetime()
fill current date using np.where and datetime.datetime.now().date
final apply your formula of difference
import pandas as pd
import numpy as np
import datetime
df['startdate'] = pd.to_datetime(df['startdate']).dt.date
df['terminationdate'] = pd.to_datetime(df['terminationdate']).dt.date
df['terminationdate'] = np.where(df['terminationdate'].isnull(),
datetime.datetime.now().date(),
df['terminationdate'])
df['result'] = (df['terminationdate'] - df['startdate'])/np.timedelta64(1, 'M')
Upvotes: 0