Reputation: 163
I am trying to filter a dataframe in R as follows.
Let mydf be the dataframe having two columns A and B.
Let udf be another dataframe having 1 column A.
I want to do the following.
Select rows from mydf where mydf[A] is in udf[A]
I am using dplyr and tried something along the lines of:
T = filter(mydf, A %in% udf['A'])
That clearly doesn't work. Is there a straightforward workaround for this without explicitly writing a for loop? Thanks a lot!
Upvotes: 3
Views: 4730
Reputation: 887163
You could use inner_join from dplyr:
library(dplyr)
r1 <- inner_join(mydf, udf, by='A')
Or use filter, as commented by @BondedDust:
r2 <- filter(mydf, A %in% udf[['A']])
identical(r1, r2)
#[1] TRUE
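To see why the double brackets matter here, a minimal sketch with hypothetical toy data (not the data from this answer): udf['A'] is still a one-column data frame, while udf[['A']] (or udf$A) is the plain vector of values that %in% needs.

```r
library(dplyr)

# Hypothetical toy data, chosen only for illustration
mydf <- data.frame(A = c(1, 2, 3, 4), B = c("w", "x", "y", "z"))
udf  <- data.frame(A = c(2, 4, 9))

class(udf["A"])    # single brackets keep the data frame wrapper
class(udf[["A"]])  # double brackets extract the column as a vector

# Matching against the extracted vector keeps the rows of mydf
# whose A value also appears in udf
kept <- filter(mydf, A %in% udf[["A"]])
kept
```

udf$A or pull(udf, A) should work in place of udf[['A']] as well.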
Or using data.table
library(data.table)
setkey(setDT(mydf),A)[udf, nomatch=0]
The data used above:
set.seed(24)
mydf <- as.data.frame(matrix(sample(1:10, 2*10, replace=TRUE),
                             ncol=2, dimnames=list(NULL, LETTERS[1:2])))
set.seed(29)
udf <- data.frame(A=sample(1:10,6,replace=TRUE))
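One thing worth checking on your own data: inner_join returns one row per matching pair, so if udf contains repeated values of A, the matching rows of mydf get repeated too. A minimal sketch with hypothetical toy data, comparing it with dplyr's semi_join, which only filters:

```r
library(dplyr)

# Hypothetical toy data; the value 2 appears twice in udf
mydf <- data.frame(A = c(1, 2, 3, 4), B = c("w", "x", "y", "z"))
udf  <- data.frame(A = c(2, 2, 4))

joined   <- inner_join(mydf, udf, by = "A")  # one output row per matching pair
filtered <- semi_join(mydf, udf, by = "A")   # each row of mydf at most once

nrow(joined)    # the A == 2 row is counted once per duplicate in udf
nrow(filtered)  # only the distinct matching rows of mydf
```

When udf has no duplicates the two should return the same rows, so the choice mostly matters for lookup tables that are not de-duplicated.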
Upvotes: 2
Reputation: 1246
You can simply pipe the data and use the left_join function.
Here is a reproducible example for this:
library(dplyr)
set.seed(123)
colors <- c(rep("yellow", 5), rep("blue", 5), rep("green", 5))
shapes <- c("circle", "star", "oblong")
numbers <- sample(1:15, replace = TRUE)
group <- sample(LETTERS, 15, replace = TRUE)
mydf <- data.frame(colors, shapes, numbers, group)
mydf
# keep only the yellow rows
mydf2 <- mydf %>%
  filter(colors == "yellow")
# with no 'by' argument, the join uses all columns the two data frames share
mydf3 <- mydf %>% left_join(mydf2)
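Note that left_join keeps every row of the left-hand data frame whether or not it has a match, so on its own it does not drop anything. A small sketch with hypothetical data (a cut-down version of the example, not the same values) comparing it with inner_join, which keeps only the matching rows:

```r
library(dplyr)

# Hypothetical cut-down data: two yellow rows, two non-yellow rows
mydf  <- data.frame(colors  = c("yellow", "yellow", "blue", "green"),
                    numbers = 1:4)
mydf2 <- filter(mydf, colors == "yellow")

all_rows   <- left_join(mydf, mydf2, by = c("colors", "numbers"))
match_only <- inner_join(mydf, mydf2, by = c("colors", "numbers"))

nrow(all_rows)    # same number of rows as mydf
nrow(match_only)  # only the rows that also appear in mydf2
```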
Upvotes: 0