Shems Vouwzee

Reputation: 43

R: How can I make str_count look for the number of occurrences of zero between two specific numbers?

I have a dataframe with a column that represents the beginning (1) and the end (2) of an event. The duration of the event is the number of zeros between 1 and 2. This event may occur multiple times. The column looks like this:

event <- c(1002, 100000000000000000, 10002000102000, 10000, 100000210200000000, 10020000010200000)

I have tried stringr::str_count(string = event, pattern = "0"), but of course that gives me the total number of zeros. What I need is the number of zeros between each 1 and the 2 that follows it. Zeros after a 2 should be dropped. The desired output is:

2
17
3 1
4
5 1
2 1

I can't figure out how to do this; it could be that my approach here is all wrong. Can anyone give me some direction?

Upvotes: 2

Views: 200

Answers (2)

PaulS

Reputation: 25323

A tidyverse approach. The numbers are converted to character beforehand; format() is used so that they are not rendered in scientific notation:

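library(tidyverse)

# format() renders each number as fixed-notation text (no scientific notation)
event <- format(c(1002, 100000000000000000, 10002000102000, 10000, 100000210200000000, 10020000010200000), scientific = FALSE)

event %>%
  str_extract_all("(?<=1)0+") %>%  # every run of zeros that directly follows a 1
  map(nchar)                       # length of each run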
#> [[1]]
#> [1] 2
#> 
#> [[2]]
#> [1] 17
#> 
#> [[3]]
#> [1] 3 1
#> 
#> [[4]]
#> [1] 4
#> 
#> [[5]]
#> [1] 5 1
#> 
#> [[6]]
#> [1] 2 1
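
The same lookbehind pattern can also be applied without the tidyverse. The following is a minimal sketch, assuming the events are stored as character strings from the start (so no number formatting is involved); the variable names hits and durations are only illustrative:

```r
# Assumption: events are kept as character strings rather than numbers
event <- c("1002", "100000000000000000", "10002000102000",
           "10000", "100000210200000000", "10020000010200000")

# Locate every run of zeros that directly follows a 1 (PCRE lookbehind)
hits <- gregexpr("(?<=1)0+", event, perl = TRUE)

# Extract the matched runs and measure the length of each one
durations <- lapply(regmatches(event, hits), nchar)
durations
```

Because the lookbehind requires a 1 immediately before the run, zeros that follow a 2 are never matched, which is the behaviour asked for in the question.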

Upvotes: 1

Ronak Shah

Reputation: 388982

A base R option:

# Avoid scientific notation when the numbers are converted to character
options(scipen = 99)
sapply(strsplit(as.character(event), ''), function(x) {
  # Positions of the 1s (event starts)
  one <- which(x == "1")
  # Positions of the 2s (event ends)
  two <- which(x == "2")
  # No 2 found: the event is still going on
  if (length(two) == 0) {
    # Count from the 1 up to the last position
    two <- length(x)
    return(two - one)
  }
  # Number of characters strictly between each 1 and its 2
  return(two - one - 1)
})

#[[1]]
#[1] 2

#[[2]]
#[1] 17

#[[3]]
#[1] 3 1

#[[4]]
#[1] 4

#[[5]]
#[1] 5 1

#[[6]]
#[1] 2 1
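
The position-based idea above treats "no 2 at all" as the only ongoing case. A value in which a finished event is followed by an unfinished one is not part of the question's data, but if it can occur, a hedged variant might look like the sketch below. The function name count_zeros and the input "1002100" are hypothetical, and the sketch assumes that 1s and 2s strictly alternate and that only zeros appear between them:

```r
# Hypothetical helper: zeros between each 1 and the 2 that closes it
count_zeros <- function(s) {
  x <- strsplit(s, "")[[1]]
  one <- which(x == "1")  # positions of event starts
  two <- which(x == "2")  # positions of event ends
  # Assumption: at most one trailing 1 is unmatched (event still running);
  # treat the position just past the end of the string as its end
  if (length(two) < length(one)) two <- c(two, length(x) + 1)
  two - one - 1
}

count_zeros("1002")     # finished event
count_zeros("10000")    # event still going on
count_zeros("1002100")  # one finished event, then one still going on
```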

Upvotes: 1
