Reputation: 691
I have successfully merged dataframes using only one column but am having trouble merging on two. I have a data set with various information on counties/states and want to merge with lat/long of that county/state from another data set.
new1
Month County ST Bag1 zip
10 MIDDLESEX CT 1 06037
10 MIDDLESEX CT 1 06037
9 NEW LONDON CT NA 06355
9 NEW LONDON CT NA 06355
9 NEW LONDON CT NA 06355
9 NEW LONDON CT NA 06355
9 NEW LONDON CT NA 06355
9 NEW LONDON CT 0 06355
10 NEW LONDON CT 5 06355
10 NEW LONDON CT 3 06355
new2
NAME ST FIPS Latitude Longitude
LITCHFIELD CT 9005 41.79274312 -73.24448889
HARTFORD CT 9003 41.80570047 -72.73328657
TOLLAND CT 9013 41.85298958 -72.337294
WINDHAM CT 9015 41.82754918 -71.98851586
NEW LONDON CT 9011 41.48228923 -72.10131783
FAIRFIELD CT 9001 41.27179117 -73.38473902
MIDDLESEX CT 9007 41.46673693 -72.53722401
NEW HAVEN CT 9009 41.41124892 -72.93600942
The code I am using is:
fullDataX = merge(new, new2[ , c("NAME","ST","Latitude", "Longitude")],
by.x=c("County","NAME"),by.y=c("ST","ST"),all.x=T)
But I get the following error:
Error in
[.data.frame
(CountyData, CountyData$ST == "CT", c("NAME", "ST", : undefined columns selected
I have also tried a left join in dplyr but get the following error:
Error: cannot join on columns 'ST' x 'ST' : index out of bounds
Any thoughts would be greatly appreciated. I have looked at numerous other questions and answers but can't seem to find an appropriate solution. Thanks.
Upvotes: 0
Views: 3328
Reputation: 4921
The following may help you:
fullDataX = merge(new1, new2, by.x = c("County", "ST"), by.y = c("NAME", "ST"))
Upvotes: 2