tospig
tospig

Reputation: 8343

subset data matching two variables in another data frame

Situation

I have two data frames, df1 and df2:

id <- c(1, 2, 3, 4, 5, 6, 7, 8)
position <- c("AA", "BB", "CC", "DD", "EE", "FF", "GG", "HH")
value <- c(100, 200, 300, 400, 500, 600, 700, 800)
df1 <- data.frame(id, position, value)

id <- c(1, 2, 3, 4, 6, 8)
position <- c("AA", "BB", "CC", "EE", "DD", "HH")
value <- c(700, 800, 900, 100, 200, 900)
df2 <- data.frame(id, position, value)

Question

I would like to select all the rows in df1 where there is a matching id AND position in df2.

Attempts

I'm pretty sure this is quite simple but I'm struggling to find a suitable solution.

I can do this using merge, but I'd like to avoid having to remove columns and hard-coding variable names.

#this method works for this small example but I have many variables and don't want
#to write them all out when removing the unwanted ones
df3 <- merge(x=df1, y=df2, by=c("id","position"))
df3 <- df3[,-4]    #remove unused column
df3

I thought I could use dplyr, something like

#library(dplyr)
df3 <- filter(df1, id %in% df2$id) 
df3

But I'm not having much luck matching on two variables.

Desired output

  id position value.x
1  1       AA     100
2  2       BB     200
3  3       CC     300
4  8       HH     800

Any help appreciated.

Upvotes: 3

Views: 2600

Answers (2)

thelatemail
thelatemail

Reputation: 93908

merge could still be used quite efficiently, without the need to remove columns after merging.

keys <- c("id","position")
merge(df1,df2[keys],by=keys)

#  id position value
#1  1       AA   100
#2  2       BB   200
#3  3       CC   300
#4  8       HH   800

Upvotes: 3

nicola
nicola

Reputation: 24490

You can paste the two columns together and match them:

    df1[match(paste(df2$id,df2$position),paste(df1$id,df1$position),nomatch=0),]
    #id position value
    #1  1       AA   100
    #2  2       BB   200
    #3  3       CC   300
    #8  8       HH   800

Upvotes: 1

Related Questions