Merging two csv files on the key in column 1

Question

I am trying to merge (combine) two CSV files using an outer join, so that the result contains one row per id (the id is in column 0) with all columns from both files. Both files have a header row on the first line.

I have tried lots of variations, but I keep getting errors complaining about the key. There are many examples on Stack Overflow, but none of them explain the underlying methodology.

In both files the first column's header is 'Code', and the key values are 5-digit codes. I am not sure whether that is causing my problems.

import pandas as pd
df1 = pd.read_csv('file1.csv', header=[0], index_col=['Code'])
df2 = pd.read_csv('file2.csv', header=[0], index_col=['Code'])

I have also tried reading the files without setting an index:

df1 = pd.read_csv('file1.csv', header=[0])
df2 = pd.read_csv('file2.csv', header=[0])

I have tried variations of...

dfx = pd.merge(df1, df2, left_on=['Code'], right_on=['Code'], how='outer')
dfx = df1[['Code','A-Score']].merge(df2[['Code','B-Score']], how='outer')
df1.merge(df2, on=['Code'], how='outer')
df = pd.merge(df1[['Code', 'Field1', 'Field2']], df2[['Code', 'Field3', 'Field4']], on='Code', how='outer', suffixes=('-A','-B'))
dfx = pd.concat([df1,df2], axis=1, join='outer')

I want all rows from both files to be combined into one file. There are no duplicate keys in either file.

So I just want to perform a quite simple merge of the two files, and understand what parameters are required and where/why.
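To make the goal concrete, here is a minimal sketch of the kind of outer merge I am after. The two small frames are made-up stand-ins for file1.csv and file2.csv (the values are hypothetical, and the 'A-Score'/'B-Score' column names are borrowed from my attempts above). As far as I understand it, `on` names the key column shared by both frames and `how='outer'` keeps keys that appear in either one.

```python
import pandas as pd

# Hypothetical stand-ins for file1.csv and file2.csv; 'Code' is a plain column here.
df1 = pd.DataFrame({"Code": ["00012", "00345"], "A-Score": [1.0, 2.0]})
df2 = pd.DataFrame({"Code": ["00345", "06789"], "B-Score": [3.0, 4.0]})

# on="Code" is the key column present in both frames;
# how="outer" keeps every key found in either frame and fills gaps with NaN.
merged = pd.merge(df1, df2, on="Code", how="outer")
print(merged)
```

With these inputs the result should have one row per distinct 'Code', with NaN wherever a key is missing from one of the frames.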

Post-edit: My problem is that the key is being interpreted as numeric; the merge works when the key is a string. So:

1. How do I force the key to be read as a string rather than as a number?
2. How do I specify the key as int64?
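For reference, a sketch of what I think both options look like, using the `dtype` parameter of `pd.read_csv`. The inline CSV text is hypothetical sample data read through `io.StringIO` instead of my real files, and the leading zeros in the codes are an assumption made only to show the difference between the two key types.

```python
import io
import pandas as pd

# Hypothetical sample contents standing in for file1.csv and file2.csv.
csv_a = "Code,A-Score\n00012,1.5\n00345,2.5\n"
csv_b = "Code,B-Score\n00345,3.5\n06789,4.5\n"

# Option 1: read the key as a string, so any leading zeros are preserved.
df1 = pd.read_csv(io.StringIO(csv_a), dtype={"Code": str})
df2 = pd.read_csv(io.StringIO(csv_b), dtype={"Code": str})
merged_str = df1.merge(df2, on="Code", how="outer")

# Option 2: read the key explicitly as int64 in both files.
df1_int = pd.read_csv(io.StringIO(csv_a), dtype={"Code": "int64"})
df2_int = pd.read_csv(io.StringIO(csv_b), dtype={"Code": "int64"})
merged_int = df1_int.merge(df2_int, on="Code", how="outer")

print(merged_str)
print(merged_int)
```

Either way, the point seems to be that the key column must have the same dtype in both frames before merging; a string key on one side and an integer key on the other is the kind of mismatch that produces key errors.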
