Reputation: 701
I'm using the R package fuzzyjoin to join two data sets. Currently I am joining on one column, and I would like to join on two.
I've tried passing the column names I want to join on as a vector, but this doesn't work. I get the following error:
Error: Each variable must be a 1d atomic vector or list. Problem variables: col.
# This works to join on one column
library(fuzzyjoin)
stringdist_inner_join(Dataset1, Dataset2, by = "Name", distance_col = NULL)
# Joining on two columns
stringdist_inner_join(Dataset1, Dataset2, by = c("Name", "TM"), distance_col = NULL)
Dataset1:
Name         Config  TM
ALTO D       BB      T
CONTRA       ST      D
EIGHT A      DD      D
OPALAS       BB      T
SAUSALITO Y  AA      D
SOLANO J     ST      D
Dataset2:
Name       Age  Rank  TM
ALTO D     50   2     T
ALTO D     20   6     D
CONTRA     10   10    D
CONTRA     15   15    T
EIGHTH     18   21    T
OPAL       19   4     T
SAUSALITO  2    12    D
SOLANO     34   43    D
Upvotes: 4
Views: 2789
Reputation: 6230
It took me a while to figure out, but I believe the correct syntax for joining on multiple columns is:
stringdist_inner_join(data1, data2,
                      by = list(x = c("Name", "TM"), y = c("Name", "TM")),
                      distance_col = NULL)
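One caveat worth checking against the data in the question: stringdist_inner_join applies the same fuzzy rule to every column listed in by, and with the default max_dist = 2 the single-letter TM codes "T" and "D" are only one edit apart, so they would count as a match. A minimal sketch, assuming you want a fuzzy match on Name but an exact match on TM, is to call fuzzy_join with one match function per column. The data frames below are rebuilt from the question; the threshold of 2 and the result names both_fuzzy and name_fuzzy are illustrative choices, not part of the original post.

```r
library(fuzzyjoin)

# Sample data rebuilt from the question
Dataset1 <- data.frame(
  Name   = c("ALTO D", "CONTRA", "EIGHT A", "OPALAS", "SAUSALITO Y", "SOLANO J"),
  Config = c("BB", "ST", "DD", "BB", "AA", "ST"),
  TM     = c("T", "D", "D", "T", "D", "D"),
  stringsAsFactors = FALSE
)
Dataset2 <- data.frame(
  Name = c("ALTO D", "ALTO D", "CONTRA", "CONTRA",
           "EIGHTH", "OPAL", "SAUSALITO", "SOLANO"),
  Age  = c(50, 20, 10, 15, 18, 19, 2, 34),
  Rank = c(2, 6, 10, 15, 21, 4, 12, 43),
  TM   = c("T", "D", "D", "T", "T", "T", "D", "D"),
  stringsAsFactors = FALSE
)

# Fuzzy match on BOTH columns (the syntax from the answer above).
# With max_dist = 2, TM values "T" and "D" also count as a match.
both_fuzzy <- stringdist_inner_join(
  Dataset1, Dataset2,
  by = list(x = c("Name", "TM"), y = c("Name", "TM")),
  max_dist = 2
)

# Fuzzy match on Name, exact match on TM: one match function per column.
# The threshold of 2 is an assumption; tune it for your data.
name_fuzzy <- fuzzy_join(
  Dataset1, Dataset2,
  by = c("Name", "TM"),
  match_fun = list(
    Name = function(a, b) stringdist::stringdist(a, b) <= 2,
    TM   = `==`
  ),
  mode = "inner"
)

print(both_fuzzy)
print(name_fuzzy)
```

Because the join columns have the same names in both data frames, the result keeps both copies with .x and .y suffixes (Name.x, Name.y, TM.x, TM.y), which makes it easy to inspect which pairs were matched.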
Upvotes: 2