Select rows from a data frame according to another vector, including repetitions

Question

Example data:

dates=seq(as.POSIXct("2015-01-01 00:00:00"), as.POSIXct("2015-01-07 00:00:00"), by="day")
data=rnorm(7,1,2)
groupID=c(12,14,16,24,35,46,54)

DF=data.frame(Date=dates,Data=data,groupID=groupID)

BB=c(12,12,16,24,35,35)
DF[DF$groupID %in% BB,]

        Date       Data groupID
1 2015-01-01  4.4104202       12
3 2015-01-03  2.1557735       16
4 2015-01-04 -0.9880946       24
5 2015-01-05 -0.3396025       35

I need to filter the data frame DF according to values in my vector BB which match the groupID column. However, if BB contains repetitions, this is not reflected in the result.

Since my vector BB includes two values of 12 and two of 35, the output should in fact be:

        Date       Data groupID
1 2015-01-01  4.4104202       12
1 2015-01-01  4.4104202       12
3 2015-01-03  2.1557735       16
4 2015-01-04 -0.9880946       24
5 2015-01-05 -0.3396025       35
5 2015-01-05 -0.3396025       35

Is there a way to achieve this? And to keep the ordering of the vector BB if possible?

bgoldst · Accepted Answer

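Use match(), which returns one row position per element of BB, so both the repetitions and the order of BB are preserved (findInterval() would also work here, but only because groupID is sorted in ascending order and every value of BB occurs in it):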

DF[match(BB, DF$groupID), ]
##           Date      Data groupID
## 1   2015-01-01 1.2199835      12
## 1.1 2015-01-01 1.2199835      12
## 3   2015-01-03 1.8141556      16
## 4   2015-01-04 0.2748579      24
## 5   2015-01-05 3.2030200      35
## 5.1 2015-01-05 3.2030200      35

(Note that the Data column is different because you used rnorm() to generate it without calling set.seed() first. It is recommended to call set.seed() in any code sample where you incorporate randomness so that exact results can be reproduced.)
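
To make the difference between the two approaches concrete, here is a minimal reproducible sketch. It assumes the same DF and BB as in the question; the seed value and the names mask, idx, res, BB2, idx2 and res2 are my own, chosen only for illustration. The last part covers a case the question does not mention: a value in the lookup vector that has no matching groupID.

```r
set.seed(1)  # arbitrary seed so the Data column is reproducible

DF <- data.frame(
  Date    = seq(as.POSIXct("2015-01-01 00:00:00"),
                as.POSIXct("2015-01-07 00:00:00"), by = "day"),
  Data    = rnorm(7, 1, 2),
  groupID = c(12, 14, 16, 24, 35, 46, 54)
)
BB <- c(12, 12, 16, 24, 35, 35)

# %in% gives one TRUE/FALSE per row of DF, so a row can be kept at most once
mask <- DF$groupID %in% BB

# match() gives one row position per element of BB, in the order of BB
idx <- match(BB, DF$groupID)   # 1 1 3 4 5 5
res <- DF[idx, ]

# Hypothetical case: 99 is not a groupID, so match() returns NA for it.
# Indexing with NA would add a row of NAs, so drop those positions first.
BB2  <- c(35, 99, 12, 12)
idx2 <- match(BB2, DF$groupID)
res2 <- DF[idx2[!is.na(idx2)], ]
```

Note that match() only returns the first matching position, so this relies on groupID being unique within DF, as it is in the example data.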
