tjr
tjr

Reputation: 691

Error with merging dataframes using two columns

I have successfully merged dataframes using only one column but am having trouble merging on two. I have a data set with various information on counties/states and want to merge with lat/long of that county/state from another data set.

new1

Month     County ST Bag1   zip
10  MIDDLESEX CT    1 06037
10  MIDDLESEX CT    1 06037
9 NEW LONDON CT   NA 06355
9 NEW LONDON CT   NA 06355
9 NEW LONDON CT   NA 06355
9 NEW LONDON CT   NA 06355
9 NEW LONDON CT   NA 06355
9 NEW LONDON CT    0 06355
10 NEW LONDON CT    5 06355
10 NEW LONDON CT    3 06355

new2

NAME ST FIPS    Latitude    Longitude
LITCHFIELD CT 9005 41.79274312 -73.24448889
HARTFORD CT 9003 41.80570047 -72.73328657
TOLLAND CT 9013 41.85298958   -72.337294
WINDHAM CT 9015 41.82754918 -71.98851586
NEW LONDON CT 9011 41.48228923 -72.10131783
FAIRFIELD CT 9001 41.27179117 -73.38473902
MIDDLESEX CT 9007 41.46673693 -72.53722401
NEW HAVEN CT 9009 41.41124892 -72.93600942

The code I am using is:

fullDataX = merge(new, new2[ , c("NAME","ST","Latitude", "Longitude")], 
       by.x=c("County","NAME"),by.y=c("ST","ST"),all.x=T)

But I get the following error:

Error in [.data.frame(CountyData, CountyData$ST == "CT", c("NAME", "ST", : undefined columns selected

I have also tried a left join in dplyr but get the following error:

Error: cannot join on columns 'ST' x 'ST' : index out of bounds

Any thoughts would be greatly appreciated. I have looked at numerous other questions and answers but can't seem to find an appropriate solution. Thanks.

Upvotes: 0

Views: 3328

Answers (1)

Ruthger Righart
Ruthger Righart

Reputation: 4921

The following may help you:

fullDataX = merge(new1, new2, by.x = c("County", "ST"), by.y = c("NAME", "ST"))

Upvotes: 2

Related Questions