Reputation: 1866
Example data:
dates=seq(as.POSIXct("2015-01-01 00:00:00"), as.POSIXct("2015-01-07 00:00:00"), by="day")
data=rnorm(7,1,2)
groupID=c(12,14,16,24,35,46,54)
DF=data.frame(Date=dates,Data=data,groupID=groupID)
BB=c(12,12,16,24,35,35)
DF[DF$groupID %in% BB,]
Date Data groupID
1 2015-01-01 4.4104202 12
3 2015-01-03 2.1557735 16
4 2015-01-04 -0.9880946 24
5 2015-01-05 -0.3396025 35
I need to filter the data frame DF according to the values in my vector BB that match the groupID column. However, if BB contains repetitions, this is not reflected in the result. Since my vector BB includes two values of 12 and two of 35, the output should in fact be:
Date Data groupID
1 2015-01-01 4.4104202 12
1 2015-01-01 4.4104202 12
3 2015-01-03 2.1557735 16
4 2015-01-04 -0.9880946 24
5 2015-01-05 -0.3396025 35
5 2015-01-05 -0.3396025 35
Is there a way to achieve this, and to keep the ordering of the vector BB if possible?
Upvotes: 0
Views: 231
Reputation: 349
You can convert BB into a data.frame and use merge() to join DF and BB on groupID. Specifically:
dates=seq(as.POSIXct("2015-01-01 00:00:00"), as.POSIXct("2015-01-07 00:00:00"), by="day")
groupID=c(12,14,16,24,35,46,54)
set.seed(1234)
data=rnorm(7,1,2)
DF=data.frame(Date=dates,Data=data,groupID=groupID)
BB=data.frame(groupID=c(12,12,16,24,35,35))
Test result:
> merge(DF,BB,by="groupID")
groupID Date Data
1 12 2015-01-01 -1.414131
2 12 2015-01-01 -1.414131
3 16 2015-01-03 3.168882
4 24 2015-01-04 -3.691395
5 35 2015-01-05 1.858249
6 35 2015-01-05 1.858249
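One caveat worth noting: merge() sorts the result on the key column by default, so the rows above only follow the order of BB because BB happens to be sorted already. The following is a minimal sketch of one way to keep BB's order when it is not sorted. The helper column ord and the fixed Data values are illustrative assumptions, not part of the original example.

```r
# Illustrative data: fixed values instead of rnorm() so the result is reproducible
DF <- data.frame(groupID = c(12, 14, 16, 24, 35, 46, 54),
                 Data    = c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5))

# BB is deliberately unsorted; 'ord' (hypothetical helper column) records each position
BB <- data.frame(groupID = c(35, 12, 35, 16, 12, 24))
BB$ord <- seq_len(nrow(BB))

res <- merge(DF, BB, by = "groupID")  # result comes back sorted by groupID
res <- res[order(res$ord), ]          # restore the original order of BB
res$ord <- NULL                       # drop the helper column
rownames(res) <- NULL
res
```

After reordering, res$groupID follows BB exactly (35, 12, 35, 16, 12, 24), repetitions included.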
Upvotes: 0
Reputation: 35314
Use match() (or findInterval()):
DF[match(BB,DF$groupID),];
## Date Data groupID
## 1 2015-01-01 1.2199835 12
## 1.1 2015-01-01 1.2199835 12
## 3 2015-01-03 1.8141556 16
## 4 2015-01-04 0.2748579 24
## 5 2015-01-05 3.2030200 35
## 5.1 2015-01-05 3.2030200 35
(Note that the Data column is different because you used rnorm() to generate it without calling set.seed() first. It is recommended to call set.seed() in any code sample that incorporates randomness so that the exact results can be reproduced.)
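One thing to watch for with this approach: match() returns NA for any element of BB that has no counterpart in DF$groupID, and indexing a data frame with NA yields a row of NAs. Below is a small sketch, assuming you would rather drop such values. The value 99, the fixed Data values, and the variable name idx are illustrative and not taken from the question.

```r
# Illustrative data: fixed values instead of rnorm() so the result is reproducible
DF <- data.frame(groupID = c(12, 14, 16, 24, 35, 46, 54),
                 Data    = c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5))
BB <- c(35, 12, 99, 12)   # 99 does not occur in DF$groupID

idx <- match(BB, DF$groupID)    # position of each BB value in DF$groupID, NA if absent
res <- DF[idx[!is.na(idx)], ]   # drop unmatched values; BB's order and repeats are kept
res
```

Here idx is 5, 1, NA, 1, so res contains the rows for groupID 35, 12 and 12, in that order.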
Upvotes: 1