Tom J Muthirenthi

Reputation: 3330

Month difference YYYYMM Pandas

I had two date columns in a data frame, both of float type, so I converted them to the date format YYYYMM. Now I have to find the difference in months between them. I tried the code below, but it gives an error.

df['Date_1'] = pd.to_datetime(df['Date_1'], format = '%Y%m%d').dt.strftime('%Y%m') #Convert float to YYYYMM Format
df['Date_2'] = pd.to_datetime(df['Date_2'], format='%Y%m.0').dt.strftime('%Y%m') #Convert float to YYYYMM Format
df['diff'] = df['Date_1'] - df['Date_2'] #Gives error

Upvotes: 2

Views: 2577

Answers (1)

jezrael

Reputation: 862481

strftime returns strings, and strings cannot be subtracted. I think you need to subtract periods created by to_period instead:

df = pd.DataFrame({'Date_1':[20150810, 20160804],
                   'Date_2':[201505.0, 201602.0]})

print (df)
     Date_1    Date_2
0  20150810  201505.0
1  20160804  201602.0

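# Parse each column, then keep only the year and month as a monthly Period
df['Date_1'] = pd.to_datetime(df['Date_1'], format='%Y%m%d').dt.to_period('M')
df['Date_2'] = pd.to_datetime(df['Date_2'], format='%Y%m.0').dt.to_period('M')
# Subtracting monthly periods counts the months between them
df['diff'] = df['Date_1'] - df['Date_2']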
print (df)
   Date_1  Date_2 diff
0 2015-08 2015-05    3
1 2016-08 2016-02    6
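
As far as I know, in newer pandas releases (0.24 and later) subtracting two Period columns returns offset objects such as <3 * MonthEnds> rather than plain integers. If an integer column is needed, a minimal sketch that should not depend on the pandas version is to compute the difference from the year and month parts. The names d1 and d2 are only illustrative, and the float column is assumed to hold whole YYYYMM values without NaN:

```python
import pandas as pd

df = pd.DataFrame({'Date_1': [20150810, 20160804],
                   'Date_2': [201505.0, 201602.0]})

# Date_1 is YYYYMMDD; go through strings so the format is unambiguous
d1 = pd.to_datetime(df['Date_1'].astype(str), format='%Y%m%d').dt.to_period('M')

# Date_2 is a float YYYYMM; drop the ".0" and append a dummy day before parsing
# (assumes there are no missing values in the column)
d2 = pd.to_datetime(df['Date_2'].astype(int).astype(str) + '01',
                    format='%Y%m%d').dt.to_period('M')

# Whole months between the two periods, as plain integers
df['diff'] = (d1.dt.year - d2.dt.year) * 12 + (d1.dt.month - d2.dt.month)
print(df['diff'].tolist())  # [3, 6]
```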

Another solution is to convert Date_1 to the first day of the month. Subtracting the two columns then gives the difference in days rather than in months:

df['Date_1'] = pd.to_datetime(df['Date_1'], format = '%Y%m%d') - pd.offsets.MonthBegin()
df['Date_2'] = pd.to_datetime(df['Date_2'], format='%Y%m.0')
df['diff'] = df['Date_1'] - df['Date_2'] 
print (df)
      Date_1     Date_2     diff
0 2015-08-01 2015-05-01  92 days
1 2016-08-01 2016-02-01 182 days
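
One caveat with subtracting MonthBegin: a date that already falls on the first of a month is moved back a whole month. The sample data has no such date, so the output above is unaffected. A small sketch of the difference, using made-up dates and a round trip through monthly periods as one possible way to floor to the month start:

```python
import pandas as pd

# Hypothetical sample: one mid-month date and one that is already a month start
dates = pd.Series(pd.to_datetime(['2015-08-10', '2015-08-01']))

# Subtracting MonthBegin rolls a mid-month date back to the 1st,
# but moves a date that is already on the 1st to the previous month
shifted = dates - pd.offsets.MonthBegin()

# Converting to a monthly period and back always lands on the 1st of the same month
floored = dates.dt.to_period('M').dt.to_timestamp()

print(shifted.tolist())
print(floored.tolist())
```

Here shifted holds 2015-08-01 and 2015-07-01, while floored holds 2015-08-01 for both rows.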

Upvotes: 3
