Adrian

Reputation: 9793

R: find the position in a vector that does not decrease for a certain number of iterations

numbers <- c(1, 0.9, 0.8, 0.7, 0.71, 0.7, 0.72, 0.69, 0.696, 0.697, 0.7, 
0.71, 0.72, 0.55, 0.6, 0.66, 0.55, 0.56, 0.58)

Given numbers, I want to find the first index i such that none of the next n = 5 values is smaller than numbers[i]. In the above example, the index I'm looking for is 8, because numbers[8] <= numbers[9:(8 + 5)] holds for every element. Here's my attempt:

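myfun <- function(numbers, n){
  # Stop n positions before the end so (i + 1):(i + n) never runs past the vector
  for(i in seq_len(max(length(numbers) - n, 0))){
    if(all(numbers[i] <= numbers[(i + 1):(i + n)])){
      return(i)
    }
  }
  NA_integer_  # no index qualifies
}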

> myfun(numbers, 5)
[1] 8

Is there a quicker way to obtain the answer without writing a for loop?

Upvotes: 0

Views: 229

Answers (1)

Ronak Shah

Reputation: 388992

EDIT

I think I misunderstood the question earlier (thanks to @thelatemail for pointing that out). You want to find the first value that is less than or equal to all of the next n values.

You can do this with rolling operations.

n <- 5
# The window is n + 1 wide: the value itself plus the next n values
which(zoo::rollapply(numbers, n + 1, function(x) all(x[-1] >= x[1])))[1]
#[1] 8
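
If adding zoo as a dependency is not an option, the same rolling comparison can probably be done in base R with embed(). The following is only a sketch and not part of the original answer; the names win and ok are made up for illustration, and the window is assumed to be n + 1 wide (the value itself plus the next n values).

```r
numbers <- c(1, 0.9, 0.8, 0.7, 0.71, 0.7, 0.72, 0.69, 0.696, 0.697, 0.7,
             0.71, 0.72, 0.55, 0.6, 0.66, 0.55, 0.56, 0.58)
n <- 5

# Each row of embed() is one window of n + 1 consecutive values in reverse
# order, so the last column holds the value that starts the window.
win <- embed(numbers, n + 1)

# TRUE where all n following values are >= the value that starts the window
ok <- rowSums(win[, seq_len(n), drop = FALSE] >= win[, n + 1]) == n

which(ok)[1]
#[1] 8
```

Because embed() builds the whole window matrix at once, this trades memory for speed; for very long vectors or large n the rolling version may be the safer choice.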

Earlier answer

This returns the index at which the first run of at least n consecutive increases starts in numbers.

You can use rle:

n <- 5
with(rle(diff(numbers) > 0), 
      sum(lengths[seq_len(which(lengths >= n & values)[1] - 1)])) + 1
#[1] 8

You can break it down for better understanding:

diff gives the difference between consecutive numbers.

diff(numbers)
# [1] -0.100 -0.100 -0.100  0.010 -0.010  0.020 -0.030  0.006  0.001
#[10]  0.003  0.010  0.010 -0.170  0.050  0.060 -0.110  0.010  0.020

We compare it with > 0 to get TRUE for increasing values and FALSE for decreasing.

diff(numbers) > 0
# [1] FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE
#[12]  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE

We apply rle to it:

tmp <- rle(diff(numbers) > 0)
tmp
#Run Length Encoding
#  lengths: int [1:10] 3 1 1 1 1 5 1 2 1 2
#  values : logi [1:10] FALSE TRUE FALSE TRUE FALSE TRUE ...

We find the runs that are increasing and whose length is greater than or equal to n:

tmp$lengths >= n & tmp$values
#[1] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE

Use which to get its index, and [1] to select the first one if there are several:

which(tmp$lengths >= n & tmp$values)[1]
#[1] 6

Sum all the run lengths before this index (hence the - 1) to get the number of positions that precede the run:

sum(tmp$lengths[seq_len(which(tmp$lengths >= n & tmp$values)[1] - 1)])
#[1] 7

Finally, add 1 to that sum to get the index where the run starts, which is 8 here.
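
The "sum the earlier lengths, then add 1" step can also be read as a cumulative sum over the run lengths, which gives the start position of every run at once. This is a small sketch of that idea rather than part of the original answer; run_starts and first_start are names introduced here for illustration.

```r
numbers <- c(1, 0.9, 0.8, 0.7, 0.71, 0.7, 0.72, 0.69, 0.696, 0.697, 0.7,
             0.71, 0.72, 0.55, 0.6, 0.66, 0.55, 0.56, 0.58)
n <- 5

runs <- rle(diff(numbers) > 0)

# The first run starts at position 1; every later run starts right after
# the combined length of the runs before it.
run_starts <- cumsum(c(1, head(runs$lengths, -1)))
run_starts
# [1]  1  4  5  6  7  8 13 14 16 17

# Start of the first increasing run that is at least n long
first_start <- run_starts[runs$lengths >= n & runs$values][1]
first_start
#[1] 8
```

A position in diff(numbers) compares numbers[i] with numbers[i + 1], so the start of a run in the differences is also the start index in numbers.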

Working through it step by step like this makes edge cases easier to handle than the one-liner at the top does.
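
As one example of such an edge case, the steps could be wrapped in a small function that returns NA when no run is long enough, instead of failing inside seq_len(). This is a sketch under that assumption; first_increasing_run is a hypothetical name, not something from the original answer or from any package.

```r
# Hypothetical helper: index where the first run of at least n consecutive
# increases starts in x, or NA if there is no such run.
first_increasing_run <- function(x, n) {
  runs <- rle(diff(x) > 0)

  # Index of the first run that is increasing and long enough (NA if none)
  hit <- which(runs$lengths >= n & runs$values)[1]
  if (is.na(hit)) {
    return(NA_integer_)
  }

  # Positions covered by the earlier runs, plus 1 for the start of this run
  sum(runs$lengths[seq_len(hit - 1)]) + 1L
}

numbers <- c(1, 0.9, 0.8, 0.7, 0.71, 0.7, 0.72, 0.69, 0.696, 0.697, 0.7,
             0.71, 0.72, 0.55, 0.6, 0.66, 0.55, 0.56, 0.58)

first_increasing_run(numbers, 5)
#[1] 8
first_increasing_run(c(3, 2, 1), 2)
#[1] NA
```

Note that this follows the earlier answer's reading of the problem (strictly increasing consecutive values), which is not the same condition as the one in the edit above.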

Upvotes: 2
