Nick
Nick

Reputation: 3134

Difference in months between two dataframe columns

I have two date columns ('dob', 'paymentDate'):

{'dob': {0: nan, 1: '10/16/1988', 2: nan, 3: nan, 4: nan},
'paymentDate': {0: '20120501',
1: '20100602',
2: '20110601',
3: '20120501',
4: '20110101'}}

I want to find the time delta in months between these. I am interested only in completed months (essentially the "floor" of the timedelta in months). In this particular case the result for the second row should be 259 (21 years, 7 months and 17 days = 21*12 + 7 = 259 months):

{'difference_in_months': {0: nan, 1: 259, 2: nan, 3: nan, 4: nan}}

How can I do this? I tried

from dateutil import relativedelta    
df_training_data['difference_in_months'] = relativedelta.relativedelta(df_training_data['paymentDate'], df_training_data['dob']).months

but I'm getting

ValueError: The truth value of a Series is ambiguous

Upvotes: 0

Views: 47

Answers (1)

BENY
BENY

Reputation: 323226

You can using to_datetime and to_period

df=df.apply(pd.to_datetime,errors = 'coerce',axis=1)
df.paymentDate.dt.to_period('M')-df.dob.dt.to_period('M')-1
Out[95]: 
0    NaT
1    259
2    NaT
3    NaT
4    NaT
dtype: object

Upvotes: 2

Related Questions