Reputation: 7928
I have two data frames, x and y.
x <- data.frame(id = c(1, 2, 3, 4, 5), g = c(21, 52, 43, 94, 35))
y <- data.frame(id = c(3, 4, 7), u = c(55, 77, 99))
I want to subset x to include only the observations with "IDs" that are also in y.
What is the best way of doing this?
Thanks!
Upvotes: 2
Views: 7769
Reputation: 141
The accepted answer works only because the values 3 and 4 in x$id happen to sit in rows 3 and 4: intersect() returns id values, which x[...] then treats as row numbers. If the ids are in a different order, the result is wrong. For example:
x<-data.frame(id=c(1,3,2,4,5), g=c(21,52,43,94,35))
x[intersect(x$id, y$id),]
  id  g
3  2 43
4  4 94
The following will work properly, regardless of the position of the common elements:
x[is.element(x$id,intersect(x$id,y$id)),]
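For what it's worth, the same filter can probably be written more compactly with %in%, since is.element(a, b) and a %in% b do the same membership test. A minimal sketch, reusing the reordered x from above (the names keep and common are just for illustration):

```r
# Reordered data from the example above, so ids no longer match row positions.
x <- data.frame(id = c(1, 3, 2, 4, 5), g = c(21, 52, 43, 94, 35))
y <- data.frame(id = c(3, 4, 7), u = c(55, 77, 99))

# Logical mask: TRUE for each row of x whose id also occurs in y$id.
keep <- x$id %in% y$id

# Index by the logical mask, not by the id values themselves.
common <- x[keep, ]
print(common)
```

Because the index is a logical vector with one entry per row, it does not matter where the matching ids are located in x.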
Upvotes: 3
Reputation: 61154
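Use setdiff to exclude observations that appear in both data frames (wrapped in %in% so rows are selected by id value rather than by row position):
> x[x$id %in% setdiff(x$id, y$id),]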
  id  g
1  1 21
2  2 52
5  5 35
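A small sketch of the same exclusion with a negated membership test, assuming the goal is "rows of x whose id is not in y". It uses the reordered x from the other answer, where ids and row numbers differ; only_x is a name made up for this example:

```r
# Reordered ids: row position and id value no longer coincide.
x <- data.frame(id = c(1, 3, 2, 4, 5), g = c(21, 52, 43, 94, 35))
y <- data.frame(id = c(3, 4, 7), u = c(55, 77, 99))

# Keep rows of x whose id does NOT occur in y$id.
only_x <- x[!(x$id %in% y$id), ]
print(only_x)
```

This selects the rows with ids 1, 2 and 5, whichever rows they happen to be in.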
Use merge to include observations present in both data frames:
> merge(x, y)
  id  g  u
1  3 43 55
2  4 94 77
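Note that merge(x, y) joins on all shared column names and also brings in y's columns. A hedged sketch of two variants, assuming id is the intended key: one names the key explicitly, the other passes only y's id column so the result keeps just x's columns (both and x_cols_only are illustrative names):

```r
x <- data.frame(id = c(1, 2, 3, 4, 5), g = c(21, 52, 43, 94, 35))
y <- data.frame(id = c(3, 4, 7), u = c(55, 77, 99))

# Inner join on an explicit key; the result has columns id, g and u.
both <- merge(x, y, by = "id")

# Join against y's id column only, so the result has just id and g.
# If y contained duplicate ids, matching rows of x would be repeated.
x_cols_only <- merge(x, y["id"], by = "id")

print(both)
print(x_cols_only)
```

Unlike logical indexing, merge returns a new data frame with fresh row names and, by default, sorted by the key.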
Or are you looking for this subset?
> x[intersect(x$id, y$id),]
  id  g
3  3 43
4  4 94
Upvotes: 6