Reputation: 33
Despite multiple attempts, I cannot get a simple merge of two dataframes to work. The code below raises KeyError: 'CODE' on the merge call.
Note 1: To make the post reproducible, StringIO is used here with only two lines within each CSV, but in real life I read from files with thousands of records.
Note 2: Notice the trailing ',' (separator) at the end of each line: my CSV files are badly formatted, but this is how the actual files are.
Note 3: I am using Python 2.7
from StringIO import StringIO
import pandas as pd
master = StringIO("""N-NUMBER,SERIAL NUMBER,MFR MDL CODE,ENG MFR MDL,YEAR MFR,TYPE REGISTRANT,NAME,STREET,STREET2,CITY,STATE,ZIP CODE,REGION,COUNTY,COUNTRY,LAST ACTION DATE,CERT ISSUE DATE,CERTIFICATION,TYPE AIRCRAFT,TYPE ENGINE,STATUS CODE,MODE S CODE,FRACT OWNER,AIR WORTH DATE,OTHER NAMES(1),OTHER NAMES(2),OTHER NAMES(3),OTHER NAMES(4),OTHER NAMES(5),EXPIRATION DATE,UNIQUE ID,KIT MFR, KIT MODEL,MODE S CODE HEX,
1 ,1071 ,3980115,54556,1988,5,FEDERAL AVIATION ADMINISTRATION ,WASHINGTON REAGAN NATIONAL ARPT ,3201 THOMAS AVE HANGAR 6 ,WASHINGTON ,DC,20001 ,1,001,US,20160614,19900214,1T ,5,5 ,V ,50000001, ,19880909, , , , , ,20191130,00524101, , ,A00001 ,""")
mfr = StringIO("""CODE,MFR,MODEL,TYPE-ACFT,TYPE-ENG,AC-CAT,BUILD-CERT-IND,NO-ENG,NO-SEATS,AC-WEIGHT,SPEED,
3980115,EXLINE ACE-C ,ACE-C ,4,1 ,1,1,01,001,CLASS 1,0082,""")
masterdf = pd.read_csv(master,sep=",",index_col=False)
mfrdf = pd.read_csv(mfr,sep=",",index_col=False)
masterdf.merge(mfrdataframe,left_on='MFR MDL CODE',right_on='CODE', how='inner')
Upvotes: 1
Views: 151
Reputation: 121
I think the problem is the name of the dataframe you're passing to merge: mfrdataframe should instead be mfrdf.
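For illustration, here is a minimal sketch of the merge with the dataframe name corrected. The CSV contents are shortened, hypothetical stand-ins for the ones in the question (the trailing commas are kept), and the import is written for Python 3; on Python 2.7 you would keep `from StringIO import StringIO`.

```python
from io import StringIO  # Python 3; on Python 2.7 use: from StringIO import StringIO
import pandas as pd

# Shortened, hypothetical stand-ins for the two CSVs in the question.
# Each line keeps the trailing ',' separator.
master = StringIO(
    "N-NUMBER,MFR MDL CODE,NAME,\n"
    "1,3980115,FEDERAL AVIATION ADMINISTRATION,\n"
)
mfr = StringIO(
    "CODE,MFR,MODEL,\n"
    "3980115,EXLINE ACE-C,ACE-C,\n"
)

masterdf = pd.read_csv(master, sep=",", index_col=False)
mfrdf = pd.read_csv(mfr, sep=",", index_col=False)

# Sanity check: the right-hand key must be an actual column name.
assert "CODE" in mfrdf.columns

# Pass mfrdf (the name that was actually defined), not mfrdataframe.
merged = masterdf.merge(mfrdf, left_on="MFR MDL CODE", right_on="CODE", how="inner")
print(merged[["N-NUMBER", "MFR MDL CODE", "CODE", "MODEL"]])
```

If a KeyError still shows up with the real files, printing `mfrdf.columns` should reveal whether the key column is really named exactly 'CODE'.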
Upvotes: 2