Peter Lucas
Peter Lucas

Reputation: 1991

Currency/Date datframe merge failing

Having issues with merging two datframes (xrate and df) based on currency_str and created_date_time

display(xrate.info())

Int64Index: 1611 entries, 6 to 112
Data columns (total 3 columns):
Date        1611 non-null datetime64[ns]
PX_LAST     1611 non-null object
Currency    1611 non-null object

display(xrate.head(3))

Date       PX_LAST  Currency
2018-05-30  1       CAD
2018-05-29  1       CAD
2018-05-28  1       CAD

I created a new date to merge on:

#df['formatted_created_date_time'] = df['created_date_time'].dt.strftime('%d%m%Y')
df['formatted_created_date_time'] = df['created_date_time'].dt.strftime('%d-%m-%Y')
#convert to date
#df['formatted_created_date_time'] = pd.to_datetime(df['formatted_created_date_time'], format='%d%m%Y')
df['formatted_created_date_time'] = pd.to_datetime(df['formatted_created_date_time'], format='%d-%m-%Y')


display(df.info())

RangeIndex: 3488 entries, 0 to 3487
Data columns (total 43 columns):
created_date_time              3488 non-null datetime64[ns]
rfq_create_date_time           3488 non-null datetime64[ns]
currency_str                   3488 non-null object

display(df.head(3))

dataframe image

Now the two dataframes are merged:

result = pd.merge(df, xrate, left_on=['currency_str', 'formatted_created_date_time'], right_on=['Currency', 'Date'], how='left')

display(result.info())

RangeIndex: 3488 entries, 0 to 3487
Data columns (total 43 columns):
created_date_time              3488 non-null datetime64[ns]
rfq_create_date_time           3488 non-null datetime64[ns]
.
.
formatted_created_date_time    3488 non-null datetime64[ns]

The match has failed:

display(result.head(3))

enter image description here

System Datetime:

enter image description here

Any ideas on this one?

Upvotes: 1

Views: 53

Answers (1)

jezrael
jezrael

Reputation: 862441

It should working nice.

But another solution is merge by strings:

df['formatted_created_date_time'] = df['created_date_time'].dt.strftime('%d-%m-%Y')
xrate['Date'] = xrate['Date'].dt.strftime('%d-%m-%Y')

result = pd.merge(df, xrate, left_on=['currency_str', 'formatted_created_date_time'], 
                             right_on=['Currency', 'Date'], how='left')

Your solution should be simplify by floor or date

df['formatted_created_date_time'] = df['created_date_time'].dt.floor('d')
xrate['Date'] = xrate['Date'].dt.floor('d')

result = pd.merge(df, xrate, left_on=['currency_str', 'formatted_created_date_time'], 
                             right_on=['Currency', 'Date'], how='left')
df['formatted_created_date_time'] = df['created_date_time'].dt.date
xrate['Date'] = xrate['Date'].dt.date

result = pd.merge(df, xrate, left_on=['currency_str', 'formatted_created_date_time'], 
                             right_on=['Currency', 'Date'], how='left')

Upvotes: 1

Related Questions