Reputation: 21
I have data formatted as a string of 1s and 0s, similar to the following:
string <- c("110010100010101000000011100101")
From it, I want to extract all mutually exclusive strings that:
So for the string I presented above, using str_extract_all(), I want the output to look like:
[1] "11001010001010100000" "11100101"
Instead, I get:
> str_extract_all(string,"1(\\d+)(0{0,10})")
[[1]]
[1] "110010100010101000000011100101"
How might I edit the regex to achieve this goal? Could this be done using the base R grep functions instead of stringr?
Upvotes: 2
Views: 110
Reputation: 1052
library(stringr)
string <- c("110010100010101000000011100101")
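# The lazy `*?` stops at the first run of five 0s (or at the end of the string);
# the greedy `\\d+` in the original pattern swallowed the whole string.
str_extract_all(string, '1[0-1]*?(0{5}|$)')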
#> [[1]]
#> [1] "11001010001010100000" "11100101"
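As for the second part of the question, the same pattern should also work in base R without stringr. A minimal sketch, assuming `gregexpr()` with `perl = TRUE` (so the lazy quantifier is handled by PCRE) combined with `regmatches()`; the variable name `matches` is only for illustration:

```r
string <- c("110010100010101000000011100101")

# gregexpr() locates every non-overlapping match of the pattern;
# regmatches() then pulls out the matched substrings.
# perl = TRUE is assumed here so that the lazy `*?` is interpreted by PCRE.
matches <- regmatches(string, gregexpr("1[01]*?(0{5}|$)", string, perl = TRUE))
matches
#> [[1]]
#> [1] "11001010001010100000" "11100101"
```

Like `str_extract_all()`, this returns a list with one character vector per input string, so `matches[[1]]` gives the substrings for the first (and here only) element.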
Upvotes: 3