Reputation: 35
I have a column of a data frame that consists of dates, let's say in this form
d$x<-c("2014-05-01 11:36:12", "2014-05-01 11:36:14", "2014-05-01 11:36:15",
"2014-05-01 11:36:16", "2014-05-01 11:36:16", "2014-05-01 11:36:17")
and want to build an interval around each value, one
for each row. Then I have to find whether the elements of another column, suppose it is
d$y<-c("2014-05-01 11:38:21", "2014-05-01 11:42:26", "2014-05-01 11:47:37",
"2014-05-01 11:53:44" ,"2014-05-01 11:59:23", "2014-05-01 12:04:39")
belong to any of these intervals or not.
I used a for loop and if, but my data are very long, so this does not seem like a good option to me. length(d$x) is around 36000 and length(d$y) = 100. Here's my current code:
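# assumes d$x and d$y have already been converted with as.POSIXct()
k <- rep(0, length(d$y))  # one slot per element of d$y, since k is indexed by i
for (i in seq_along(d$y)) {
  for (j in seq_along(d$x)) {
    # d$y[i] lies within 60 seconds of d$x[j]
    if ((d$y[i] <= d$x[j] + 60) && (d$y[i] >= d$x[j] - 60))
      k[i] <- i
  }
}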
Upvotes: 1
Views: 658
Reputation: 56159
Using the sqldf package:
#data
datx <- data.frame(x=as.POSIXct(c("2014-05-01 11:36:12",
"2014-05-01 11:36:14",
"2014-05-01 11:36:15",
"2014-05-01 11:36:16",
"2014-05-01 11:36:16",
"2014-05-01 11:36:17")))
daty <- data.frame(y=as.POSIXct(c("2014-05-01 11:38:21",
"2014-05-01 11:42:26",
"2014-05-01 11:38:33",
"2014-05-01 11:53:44",
"2014-05-01 11:59:23",
"2014-05-01 12:04:39")))
myInterval <- 180 #3*60sec = 3 minutes
require(sqldf)
#get y within x +/-interval
res <-
  fn$sqldf("SELECT distinct(b.y)
            FROM datx a, daty b
            WHERE b.y BETWEEN a.x-$myInterval
                          AND a.x+$myInterval")
#output
res
# y
# 1 2014-05-01 11:38:21
# 2 2014-05-01 11:38:33
#get index
which(daty$y %in% res$y)
# [1] 1 3
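If you would rather not pull in a package, the same check can probably be done in base R. Below is a minimal sketch, not a tuned solution: it compares every y with every x via outer(). The names xs, ys, gap and inAny are mine, and tz = "UTC" is an assumption added so the result does not depend on the local time zone. For 36000 x values and 100 y values the matrix has 3.6 million cells, which should fit comfortably in memory.

```r
# Same sample data as in the sqldf example, base R only.
x <- as.POSIXct(c("2014-05-01 11:36:12", "2014-05-01 11:36:14",
                  "2014-05-01 11:36:15", "2014-05-01 11:36:16",
                  "2014-05-01 11:36:16", "2014-05-01 11:36:17"), tz = "UTC")
y <- as.POSIXct(c("2014-05-01 11:38:21", "2014-05-01 11:42:26",
                  "2014-05-01 11:38:33", "2014-05-01 11:53:44",
                  "2014-05-01 11:59:23", "2014-05-01 12:04:39"), tz = "UTC")
myInterval <- 180  # seconds, same value as above

# Work on plain seconds since the epoch so the arithmetic is unambiguous.
xs <- as.numeric(x)
ys <- as.numeric(y)

# length(y) x length(x) matrix of absolute gaps in seconds
gap <- abs(outer(ys, xs, "-"))

# TRUE where y[i] is within myInterval seconds of at least one x[j]
inAny <- rowSums(gap <= myInterval) > 0

# get index
which(inAny)
# [1] 1 3
```

This gives the same indices as the sqldf version on the sample data. If the x column grows much larger, sorting x and looking only at the nearest neighbours of each y (for example with findInterval()) would avoid building the full matrix.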
Upvotes: 2