tjebo

Reputation: 23757

Find first occurrence of value in vector, and return length of vector if value not present

I would like to find the first occurrence of a value in a vector. The value may or may not be present. If it is not present, I would like to get the length of the vector instead.

Why I want this: I need to slice a data frame by group, from the first row up to (and including) the first row in which the value occurs, or keep all rows if the value is not present. My approach for that is shown below as well. Maybe the detour via vectors is unnecessary and there is a more direct approach; I'd very much appreciate a hint or solution for that too, but this question is mainly about the vector problem. Thanks!

x <- 0:1
y <- c(0:2, 2)
z <- c(y, 3)

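# Approaches based on max/min(which(...)) do not work in general
max(which(x < 2))
#> [1] 2
## matches the desired result (2, the length of x), since 2 is absent
max(which(y < 2))
#> [1] 2
## desired result is 3
## != does of course not work either
max(which(z != 2))
#> [1] 5
## desired result is 3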

## desired result: 2 3 3
library(dplyr) # for coalesce() and the pipeline below

## my approach for the vectors
my_vecs <- list(x, y, z)
my_len <- lengths(my_vecs)
my_ind <- sapply(my_vecs, function(u) which(u == 2)[1])

coalesce(my_ind, my_len)
#> [1] 2 3 3

## in a dataframe
foo <- data.frame(id = letters[rep(my_len, my_len)], n = c(x,y,z))
foo %>%
  group_by(id) %>%
  mutate(cens = which(n == 2)[1], 
         cens = ifelse(is.na(cens), n(), cens)) %>%
  slice(1:max(cens))
#> # A tibble: 8 × 3
#> # Groups:   id [3]
#>   id        n  cens
#>   <chr> <dbl> <int>
#> 1 b         0     2
#> 2 b         1     2
#> 3 d         0     3
#> 4 d         1     3
#> 5 d         2     3
#> 6 e         0     3
#> 7 e         1     3
#> 8 e         2     3

Upvotes: 0

Views: 318

Answers (1)

tjebo

Reputation: 23757

match has a third argument, nomatch, which sets the value returned when no match is found. Passing the length of the vector there replaces the lengthy if/else construction and makes for really neat code.

x <- 0:1
y <- c(0:2, 2)
z <- c(y, 3)

sapply(list(x, y, z), function(u) match(2, u, length(u)))
#> [1] 2 3 3
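
If this is needed in more than one place, the call can be wrapped in a small helper that names the nomatch argument explicitly. This is only a sketch: first_or_length is a hypothetical name, not an existing function.

```r
# Hypothetical helper: position of the first `value` in `v`,
# or length(v) when `value` does not occur in `v`.
first_or_length <- function(v, value) {
  match(value, v, nomatch = length(v))
}

x <- 0:1
y <- c(0:2, 2)
z <- c(y, 3)

# vapply() checks that every call returns a single integer
res <- vapply(list(x, y, z), first_or_length, integer(1), value = 2)
res
#> [1] 2 3 3
```

Naming the argument (nomatch = ...) rather than passing it by position makes the fallback easier to spot when reading the code.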

Applied to the data frame problem in the question, this will give:

library(dplyr)

foo %>%
  group_by(id) %>%
  ## note: n and n() are not the same! n is the column, n() is the dplyr function giving the group size
  mutate(cens = match(2, n, n())) %>%
  slice(1:max(cens))
#> # A tibble: 8 × 3
#> # Groups:   id [3]
#>   id        n  cens
#>   <chr> <dbl> <int>
#> 1 b         0     2
#> 2 b         1     2
#> 3 d         0     3
#> 4 d         1     3
#> 5 d         2     3
#> 6 e         0     3
#> 7 e         1     3
#> 8 e         2     3
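
As for the more direct approach the question asks about: one possibility, assuming the helper column cens is not needed afterwards, is to compute the cut-off inside slice() itself. The sketch below rebuilds foo with explicit group labels so that it runs on its own; res is just an illustrative name.

```r
library(dplyr)

x <- 0:1
y <- c(0:2, 2)
z <- c(y, 3)

# Same data as in the question, with the group labels written out
foo <- data.frame(
  id = rep(c("b", "d", "e"), times = c(2, 4, 5)),
  n  = c(x, y, z)
)

res <- foo %>%
  group_by(id) %>%
  # keep rows 1..k per group, where k is the position of the first 2,
  # or the group size n() when the group contains no 2
  slice(seq_len(match(2, n, nomatch = n()))) %>%
  ungroup()
res
```

This should keep the same 8 rows as above, just without the cens column. seq_len() is used instead of 1:k so that an empty group would not produce the sequence 1, 0.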

Upvotes: 1
