Reputation: 31
I have two vectors of the same length that contain only 0 and 1 (for simplicity I use v1 and v2 in this example). I would like to count every time both v1 and v2 have the value 1 at the same position OR within a certain range in v2. For example, for a range of +-3 (rad=3), if the value of v1[10] is 1, I would like to know whether any of the following values of v2 is also 1: v2[7], v2[8], v2[9], v2[10], v2[11], v2[12] or v2[13]. The script I have now works, but it also counts matches where the value is 0. How can I change it so that it only counts matches where the value is 1?
set.seed(1)
v1 <- sample(0:1, 20, replace = TRUE)
v2 <- sample(0:1, 20, replace = TRUE)
matches <- vector()
rad <- 3
for (i in 1:length(v1)){
if ((i - rad) < 0){
matches[i] <- ifelse(v1[i] %in% v2[1:rad], TRUE, FALSE)
} else{
matches[i] <- ifelse(v1[i] %in% v2[(i-rad):(i+rad)], TRUE, FALSE)
}
}
Upvotes: 3
Views: 1056
Reputation: 13132
Another idea:
ff = function(x, y, rad)
{
wx = which(x == 1L)
wy = which(y == 1L)
i = findInterval(wx, wy, all.inside = TRUE)
ans = logical(length(x))
ans[wx[pmin(abs(wx - wy[i]), abs(wy[i + 1] - wx)) <= rad]] = TRUE
ans
}
ff(v1, v2, 3)
#[1] FALSE FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE FALSE FALSE TRUE FALSE TRUE FALSE TRUE TRUE FALSE TRUE
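To make the findInterval step easier to follow, here is a small sketch on made-up positions (the vectors wx and wy below are toy values chosen for illustration, not taken from the question's data). It shows how each 1 in x is located between two neighbouring 1s in y, and how the distance to the nearer one is computed:

```r
# Toy positions, chosen only for illustration
wx <- c(2, 9, 15)     # positions where x is 1
wy <- c(5, 10, 20)    # positions where y is 1 (sorted, as which() returns them)

# For each wx, the index of the largest wy that is <= wx;
# all.inside = TRUE clamps the result to 1..(length(wy) - 1),
# so wy[i] and wy[i + 1] both exist
i <- findInterval(wx, wy, all.inside = TRUE)
i
# [1] 1 1 2

# Distance from each wx to the nearer of its two neighbouring 1s in y
d <- pmin(abs(wx - wy[i]), abs(wy[i + 1] - wx))
d
# [1] 3 1 5

# With rad = 3, only the first two positions of wx count as matches
d <= 3
# [1]  TRUE  TRUE FALSE
```

Because the work is done once on the positions of the 1s rather than once per element, this approach should scale well for long, sparse vectors.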
Upvotes: 4
Reputation: 4474
And here is a data.table solution:
library(data.table)
dt=data.table(v1,v2)
rad = 3L
dt[,mtch:=(v1!=0L)&(v1%in%dt$v2[max(.I-rad,0):min(.I+rad,nrow(dt))]),
by=1L:nrow(dt)]$mtch
This gives
#[1] FALSE FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE FALSE FALSE TRUE FALSE TRUE FALSE TRUE TRUE FALSE TRUE
Upvotes: 0
Reputation: 26466
Yet another, using a rolling window.
library(zoo)
near<-function(x,t,rad) rollapply(x==t,1+2*rad,sum,partial=TRUE)>0
(v1==1) & near(v2,1,3)
# [1] FALSE FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE FALSE FALSE TRUE FALSE TRUE FALSE TRUE TRUE FALSE TRUE
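If you would rather not depend on zoo, the same rolling-window idea can be sketched in base R with a cumulative sum. This is only an illustration: near_base is a hypothetical name, and the vectors x and y are toy data, not the question's v1 and v2.

```r
# Hypothetical base R counterpart of near(): TRUE where x has the value t
# anywhere within rad positions of the current index
near_base <- function(x, t, rad) {
  n <- length(x)
  w <- 2 * rad + 1
  # Pad with zeros so windows at the edges are simply shorter
  padded <- c(rep(0, rad), as.numeric(x == t), rep(0, rad))
  cs <- cumsum(c(0, padded))
  # Window sum for position i is cs[i + w] - cs[i]
  counts <- cs[seq_len(n) + w] - cs[seq_len(n)]
  counts > 0
}

# Toy data for illustration
x <- c(1, 0, 0, 0, 0, 1, 0, 0)
y <- c(0, 0, 0, 0, 1, 0, 0, 0)

res <- (x == 1) & near_base(y, 1, 1)
res
# [1] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
```

Only position 6 matches: x is 1 there and y has a 1 one step away, at position 5.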
Upvotes: 1
Reputation: 3854
I'll preface my answer by saying that you're speaking R with a bit of a C accent: loops are rarely the right choice for getting something done in R. If you're new to R, you may not see the point of vectorizing everything - the justifications for using vectorized code usually talk about how it runs faster, and for small applications on a decent modern computer you're unlikely to see the difference. And you can usually hack together what you want to do using loops. But vectorization is built into the language, and in R the vectorized solution is usually a lot cleaner. Getting comfortable with vectorizing instead of reaching for a loop will, in the long term, help you get a lot more out of what is really a very powerful and useful language.
All that having been said, I'm going to suggest a way you can accomplish what you want using the sapply function. First, you can define a function that, given an index i, checks if (a) the ith element of vector 1 is a 1, and (b) if any of the elements in the i plus or minus rad entries of vector 2 are also 1.
f <- function(i, rad, x, y) {
range_min <- max(i - rad, 1)
range_max <- min(i + rad, length(y))
a <- x[i] == 1
b <- 1 %in% y[range_min:range_max]
return(a & b)
}
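As a quick check of how f behaves, here is a small sketch that calls it at a few indices. The vectors x and y are toy data made up for illustration, and f is repeated so the snippet runs on its own:

```r
# Same function as above, repeated so this snippet is self-contained
f <- function(i, rad, x, y) {
  range_min <- max(i - rad, 1)
  range_max <- min(i + rad, length(y))
  a <- x[i] == 1
  b <- 1 %in% y[range_min:range_max]
  return(a & b)
}

# Toy data for illustration
x <- c(1, 0, 0, 0, 1, 1)
y <- c(0, 0, 0, 1, 0, 0)

f(1, 2, x, y)  # FALSE: x[1] is 1, but y[1:3] contains no 1
f(2, 2, x, y)  # FALSE: x[2] is 0, so the window does not matter
f(5, 2, x, y)  # TRUE:  x[5] is 1 and y[3:6] contains a 1
f(6, 2, x, y)  # TRUE:  the window is clamped to y[4:6], which contains a 1
```

The max and min calls are what keep the window inside the vector at both ends, so no special case is needed for indices near the start or the end.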
Next, define your vectors and radius, and create a version of f that accepts one argument, the index i, and keeps the other three set to your particular values:
set.seed(1)
v1 <- sample(0:1, 20, replace = TRUE)
v2 <- sample(0:1, 20, replace = TRUE)
radius <- 3
g <- function(i) {
return(f(i, radius, v1, v2))
}
Then, you can use sapply on the vector of indices from 1 to the length of v1 to generate a vector v, where each element v[i] is the result of applying g to the ith index:
v <- sapply(1:length(v1), g)
Note that you can combine the last two steps by defining g within the sapply statement, like so:
v <- sapply(1:length(v1), function(i){f(i, radius, v1, v2)})
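A slightly stricter variant of the same idea, as a sketch: vapply lets you declare that each call must return a single logical value, and seq_along avoids the surprise that 1:length(x) gives c(1, 0) when x is empty. The vectors x and y and the radius are toy values for illustration only.

```r
# Toy data for illustration
x <- c(1, 0, 0, 0, 1, 1)
y <- c(0, 0, 0, 1, 0, 0)
rad <- 2

v <- vapply(seq_along(x), function(i) {
  # Window of y around i, clamped to valid indices
  window <- y[max(i - rad, 1):min(i + rad, length(y))]
  # TRUE only if x is 1 here and some y in the window is 1
  x[i] == 1 && any(window == 1)
}, logical(1))

v
# [1] FALSE FALSE FALSE FALSE  TRUE  TRUE
```

If the anonymous function ever returned something other than one logical value, vapply would stop with an error instead of silently changing the shape of the result, which is the main reason to prefer it here.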
The further you get with R, the more you'll find that code of this sort is much better supported by various features of the R language than code using loops. That being said, if you're dead set on modifying your existing code to make the loop work, you can change the condition you feed your ifelse calls to be the & of two conditions, like so:
set.seed(1)
v1 <- sample(0:1, 20, replace = TRUE)
v2 <- sample(0:1, 20, replace = TRUE)
matches <- vector()
rad <- 3
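for (i in 1:length(v1)){
  if ((i - rad) < 1){
    # Near the start the window runs from index 1 up to i + rad
    matches[i] <- ifelse((v1[i] %in% v2[1:(i+rad)]) & v1[i] == 1, TRUE, FALSE)
  } else{
    # Elsewhere the window is i - rad to i + rad; only count it when v1[i] is 1
    matches[i] <- ifelse((v1[i] %in% v2[(i-rad):(i+rad)]) & v1[i] == 1, TRUE, FALSE)
  }
}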
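To see why the extra condition is needed, here is a tiny sketch with made-up values (v1_i and window are hypothetical names standing in for v1[i] and the slice of v2):

```r
# Toy values for illustration
v1_i   <- 0
window <- c(0, 1, 0)

# %in% only asks "does this value occur in the window?",
# so a 0 in v1 happily matches a 0 in v2
without_check <- v1_i %in% window
without_check
# [1] TRUE

# Requiring v1_i == 1 as well rules out the 0-to-0 matches
with_check <- (v1_i %in% window) & v1_i == 1
with_check
# [1] FALSE
```

Since the vectors contain only 0 and 1, almost any window of v2 holds at least one 0, which is why the original loop reported a match for nearly every 0 in v1.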
Upvotes: 1