marcus-linmarkson

Reputation: 363

R: find indexes of vector in another vector (if it exists)

I would like to know the starting index of a vector in another vector. For example, for c(1, 1) and c(1, 0, 0, 1, 1, 0, 1) it would be 4.

Importantly, I want to match the vector exactly. Thus, for c(1, 1) inside c(1, 0, 1, 1, 1, 0) the result should be FALSE, because the run there is c(1, 1, 1), not c(1, 1).

For now I am checking if the short vector is contained in the long like this:

any(with(rle(longVec), lengths[as.logical(values)]) == length(shortVec))

But I don't know how to determine the index where it starts.

Upvotes: 2

Views: 799

Answers (3)

RLave

Reputation: 8374

This function should work:

my_function <- function(x, find) {
  # rle() splits each vector into runs; build two matrices with the
  # run lengths in row 1 and the run values in row 2
  m = matrix(unlist(rle(x)), nrow=2, byrow = TRUE)
  n = matrix(unlist(rle(find)), nrow=2, byrow = TRUE)

  # compare each column (run) of m with n
  temp_bool = apply(m, 2, function(x) x == n) # this gives a matrix of TRUE/FALSE
  # then sum by columns: a 2 means that both the run length and the run value match
  temp_bool = apply(temp_bool, 2, sum)

  # updated part
  if (any(temp_bool==2)) {
    hits = which(temp_bool==2)
    # convert the matching run numbers to start positions in x
    return(cumsum(m[1, ])[hits] - m[1, hits] + 1)
  } else {
    return(FALSE)
  }
}


my_function(x, find)
#[1] 4

my_function(y, find)
#[1] FALSE

To make it clearer, here are the results of the two apply calls for x:

apply(m, 2, function(x) x == n)
#       [,1]  [,2] [,3]  [,4]  [,5]
# [1,] FALSE  TRUE TRUE FALSE FALSE
# [2,]  TRUE FALSE TRUE FALSE  TRUE  # TRUE-TRUE on column 3 is what we seek

apply(temp_bool, 2, sum)
#[1] 1 1 2 0 1

Example data:

x <- c(1,0,0,1,1,0,1)
y <-  c(1,0,1,1,1,0)
find <- c(1,1) # as pointed out, this needs to be a pair of the same number
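
As a side note, the run number reported by rle is not itself a position in the original vector. A minimal sketch of how run numbers can be mapped to start positions with cumsum follows; the helper name run_starts is hypothetical and not part of the answer above.

```r
# Hypothetical helper: start position of every run in a vector
run_starts <- function(v) {
  r <- rle(v)
  # each run starts right after all previous runs have ended
  cumsum(r$lengths) - r$lengths + 1
}

x <- c(1, 0, 0, 1, 1, 0, 1)
r <- rle(x)
r$lengths        # 1 2 2 1 1
r$values         # 1 0 1 0 1
run_starts(x)    # 1 2 4 6 7

# runs that are exactly two 1s, mapped back to positions in x
run_starts(x)[r$lengths == 2 & r$values == 1]   # 4
```

The same lookup on c(1, 0, 1, 1, 1, 0) returns an empty vector, because its only run of ones longer than one element has length 3.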

Upvotes: 3

G. Grothendieck

Reputation: 270268

Assuming that shortVec contains only ones and longVec contains only zeros and ones, use rle and rep to create a vector lens of the same length as longVec in which each element is replaced by the length of the run it belongs to. Then multiply lens by longVec to zero out the elements that correspond to 0 in longVec. Finally, find the indices whose value equals length(shortVec) and take the first; the result is NA if there is no run of exactly that length.

lookup <- function(shortVec, longVec) {
  lens <- with(rle(longVec), rep(lengths, lengths))
  which(lens * longVec == length(shortVec))[1]
}

lookup(c(1,1), c(1, 0, 0, 1, 1, 0, 1))
## [1] 4

lookup(c(1,1), c(1, 0, 0, 1, 1, 1, 0, 1))
## [1] NA
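
To make the steps concrete, here is a sketch of the intermediate values for the first example. The variable names r, masked and hits are introduced only for illustration.

```r
longVec  <- c(1, 0, 0, 1, 1, 0, 1)
shortVec <- c(1, 1)

r <- rle(longVec)
# repeat each run's length once per element of that run
lens <- rep(r$lengths, r$lengths)            # 1 2 2 2 2 1 1
# zero out the positions where longVec is 0
masked <- lens * longVec                     # 1 0 0 2 2 0 1
# positions inside runs of ones whose length equals length(shortVec)
hits <- which(masked == length(shortVec))    # 4 5
hits[1]                                      # 4
```

Every position of a matching run shows up in hits, which is why only the first element is taken as the start index.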

Upvotes: 1

Cleland

Reputation: 359

This works for the examples below.

a <- c(1,1)
b <- c(1,0,1,1,0,0)
c <- c(1,0,1,1,1,0)

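f <- function(x, y) {
  len.x <- length(x)
  len.y <- length(y)
  if (len.x > len.y) return(FALSE)
  for(i in 1:(len.y - (len.x - 1))) {
    # does the window of y starting at i equal x?
    if(identical(y[i:(i + (len.x - 1))], x)){
      # the neighbours must differ from the ends of x;
      # a neighbour that falls outside y counts as different
      before.ok <- i == 1 || y[i - 1] != x[1]
      after.ok <- i + len.x > len.y || y[i + len.x] != x[len.x]
      if(before.ok && after.ok) {return(TRUE)}
    }
  }
  return(FALSE)
}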
f(a, b)
# TRUE
f(a, c)
# FALSE
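
Since the question asks for the starting index rather than TRUE/FALSE, here is a sketch of a variant that returns the first matching position, or NA when there is none. The name find_start is hypothetical, and treating the ends of y as valid boundaries is an assumption.

```r
# Hypothetical variant of f(): returns the first start index, or NA
find_start <- function(x, y) {
  len.x <- length(x)
  len.y <- length(y)
  if (len.x == 0 || len.x > len.y) return(NA_integer_)
  for (i in seq_len(len.y - len.x + 1)) {
    if (identical(y[i:(i + len.x - 1)], x)) {
      # assumption: a neighbour that falls outside y counts as "different"
      before.ok <- i == 1 || y[i - 1] != x[1]
      after.ok  <- i + len.x > len.y || y[i + len.x] != x[len.x]
      if (before.ok && after.ok) return(i)
    }
  }
  NA_integer_
}

find_start(c(1, 1), c(1, 0, 0, 1, 1, 0, 1))   # 4
find_start(c(1, 1), c(1, 0, 1, 1, 1, 0))      # NA
find_start(c(1, 1), c(1, 1, 0))               # 1
```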

Upvotes: 0
