Helena

Reputation: 31

match certain value of one vector with range of values of other vector

I have two vectors (same length) that only contain 0 and 1 (for simplification I used v1 and v2 in this example). I would like to count every time both v1 and v2 have value 1 at the same position OR within a certain range in v2. For example, for a range of +-3 (rad=3), if the value of v1[10] is 1, I would like to know if one of the following values of v2 is also 1: v2[7], v2[8], v2[9], v2[10], v2[11], v2[12] or v2[13]. The script I have now works, but it also counts the matches when the value is 0. How can I change it so it only takes into account the matches where the value is 1?

set.seed(1)
v1 <- sample(0:1, 20, replace = TRUE)
v2 <- sample(0:1, 20, replace = TRUE)

matches <- vector()
rad <- 3 

for (i in 1:length(v1)){

  if ((i - rad) < 0){

    matches[i] <- ifelse(v1[i] %in% v2[1:rad], TRUE, FALSE)

  } else{

    matches[i] <- ifelse(v1[i] %in% v2[(i-rad):(i+rad)], TRUE, FALSE)

 }

}

Upvotes: 3

Views: 1056

Answers (4)

alexis_laz

Reputation: 13132

Another idea:

ff = function(x, y, rad)
{
    wx = which(x == 1L)
    wy = which(y == 1L)
    i = findInterval(wx, wy, all.inside = TRUE)

    ans = logical(length(x))
    ans[wx[pmin(abs(wx - wy[i]), abs(wy[i + 1] - wx)) <= rad]] = TRUE
    ans
}
ff(v1, v2, 3)
#[1] FALSE FALSE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE FALSE  TRUE
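
To see what the findInterval step is doing, here is a small sketch with made-up positions (wx and wy below are illustrative, not taken from v1 and v2). For each position of a 1 in x it picks the neighbouring 1s in y on either side, and the smaller of the two distances is then compared with rad. It assumes y has at least two 1s, so that both neighbours exist.

```r
wx <- c(3, 7, 12)   # made-up positions of 1s in x
wy <- c(2, 5, 9)    # made-up positions of 1s in y

# all.inside = TRUE clamps the result to 1..(length(wy) - 1),
# so wy[i] and wy[i + 1] are always valid neighbours
i <- findInterval(wx, wy, all.inside = TRUE)
i
# [1] 1 2 2

# distance from each wx to the closer of its two neighbours in wy
d <- pmin(abs(wx - wy[i]), abs(wy[i + 1] - wx))
d
# [1] 1 2 3

d <= 2   # with rad = 2, only the first two positions count as matches
# [1]  TRUE  TRUE FALSE
```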

Upvotes: 4

cryo111

Reputation: 4474

And here is a data.table solution:

library(data.table)
dt=data.table(v1,v2)
rad = 3L
dt[,mtch:=(v1!=0L)&(v1%in%dt$v2[max(.I-rad,0):min(.I+rad,nrow(dt))]),
   by=1L:nrow(dt)]$mtch

This gives

#[1] FALSE FALSE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE FALSE  TRUE

Upvotes: 0

A. Webb

Reputation: 26466

Yet another, using a rolling window.

library(zoo)
near<-function(x,t,rad) rollapply(x==t,1+2*rad,sum,partial=TRUE)>0
(v1==1) & near(v2,1,3)
# [1] FALSE FALSE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE FALSE  TRUE
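
If you would rather not depend on zoo, the same windowed count can probably be sketched in base R with cumulative sums. The name near_cumsum and the small x and y vectors are made up for illustration; the idea is that the number of hits in a window is the difference of two cumulative counts.

```r
near_cumsum <- function(y, t, rad) {
  n   <- length(y)
  idx <- seq_len(n)
  cs  <- c(0, cumsum(y == t))   # cs[k + 1] = number of hits in y[1:k]
  lo  <- pmax(idx - rad, 1)     # window start, clamped to the vector
  hi  <- pmin(idx + rad, n)     # window end, clamped to the vector
  (cs[hi + 1] - cs[lo]) > 0     # TRUE where the window holds at least one hit
}

x <- c(1, 0, 0, 0, 0, 1)
y <- c(0, 0, 1, 0, 0, 0)
(x == 1) & near_cumsum(y, 1, 2)
# [1]  TRUE FALSE FALSE FALSE FALSE FALSE
```

Here x[1] matches because y[3] lies within 2 positions, while x[6] does not because y[4:6] contains no 1.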

Upvotes: 1

Empiromancer

Reputation: 3854

I'll preface my answer by saying that you're speaking R with a bit of a C accent: loops are rarely the right choice for getting something done in R. If you're new to R, you may not see the point of vectorizing everything. The justifications for using vectorized code usually talk about how it runs faster, and for small applications on a decent modern computer you're unlikely to see the difference. And you usually can hack together what you want to do using loops. But vectorization is built into the language, and in R the vectorized solution is usually a lot cleaner. Getting comfortable with vectorizing instead of reaching for a loop will, in the long term, help you get a lot more out of what is really a very powerful and useful language.

All that having been said, I'm going to suggest a way you can accomplish what you want using the sapply function. First, you can define a function that, given an index i, checks whether (a) the ith element of vector 1 is a 1, and (b) any of the elements of vector 2 within rad positions of i are also 1.

f <- function(i, rad, x, y) {
  range_min <- max(i - rad, 1)
  range_max <- min(i + rad, length(y))
  a <- x[i] == 1
  b <- 1 %in% y[range_min:range_max]
  return(a & b)
}

Next, define your vectors and radius, and create a version of f that accepts one argument, the index i, and keeps the other three set to your particular values:

set.seed(1)
v1 <- sample(0:1, 20, replace = TRUE)
v2 <- sample(0:1, 20, replace = TRUE)
radius <- 3

g <- function(i) {
  return(f(i, radius, v1, v2))
}

Then, you can use sapply on the vector of indices from 1 to the length of v1 to generate a vector v, where each element v[i] is the result of applying g to the ith index:

v <- sapply(1:length(v1), g)

Note that you can combine the last two steps by defining g within the sapply statement, like so:

v <- sapply(1:length(v1), function(i){f(i, radius, v1, v2)})
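
As a hand-checkable variant of the same idea, here is a minimal sketch that folds everything into one function and uses vapply with seq_along, which (as far as I know) is a bit safer than sapply with 1:length() when the input might be empty. The name match_near and the small x and y vectors are made up for illustration.

```r
match_near <- function(x, y, rad) {
  n <- length(y)
  vapply(seq_along(x), function(i) {
    # window of y around position i, clamped to the ends of the vector
    window <- y[max(i - rad, 1):min(i + rad, n)]
    x[i] == 1 && any(window == 1)
  }, logical(1))
}

x <- c(1, 0, 0, 0, 0, 1)
y <- c(0, 0, 1, 0, 0, 0)

match_near(x, y, rad = 1)
# [1] FALSE FALSE FALSE FALSE FALSE FALSE

match_near(x, y, rad = 2)
# [1]  TRUE FALSE FALSE FALSE FALSE FALSE
```

With rad = 1 neither 1 in x has a 1 in y close enough; with rad = 2 the window for x[1] reaches y[3].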

The further you get with R, the more you'll find that code of this sort is much better supported by various features of the R language than code using loops. That being said, if you're dead set on modifying your existing code to make the loop work, you can change the logical vector you feed your ifelse functions to be the & of two conditions, like so:

set.seed(1)
v1 <- sample(0:1, 20, replace = TRUE)
v2 <- sample(0:1, 20, replace = TRUE)
matches <- vector()
rad <- 3 

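for (i in 1:length(v1)){
  if ((i - rad) < 1){
    # near the start, the window runs from 1 to i + rad (not just 1:rad)
    matches[i] <- ifelse((v1[i] %in% v2[1:(i + rad)]) & v1[i] == 1, TRUE, FALSE)
  } else{
    # positions past the end of v2 come back as NA, which never match 1
    matches[i] <- ifelse((v1[i] %in% v2[(i-rad):(i+rad)]) & v1[i] == 1, TRUE, FALSE)
  }
}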

Upvotes: 1
