Reputation: 193
I'm using R and I have a vector of strings made up of the characters 1 and 2.
Examples of strings could be the following:
"11111111******111"
"11111111111***2222222"
"1111*****22222**111*****1111"
where "*" denotes a gap.
I'm interested in deleting runs of gaps that are no longer than a certain number n.
Example with sequences above:
With n = 3, the result should be:
1. "11111111******111"
2. "111111111112222222"
3. "1111*****22222111*****1111"
In the second and third strings the "function" deleted a run of 3 gaps and a run of 2 gaps, because I want to delete every run of gaps of length 3 or less.
Upvotes: 0
Views: 69
Reputation: 4554
gsub('(?<=\\d)\\*{1,3}(?=\\d)', '', v1, perl = TRUE)
[1] "11111111******111" "111111111112222222" "1111*****22222111*****1111"
Upvotes: 0
Reputation: 5956
Similar to @akrun's answer:
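x <- list("11111111******111",
"11111111111***2222222",
"1111*****22222**111*****1111")
# {1,3} rather than {,3}: PCRE may treat a quantifier with no lower bound as literal text.
# The lookahead (?=\\d) checks for the right-hand digit without consuming it.
lapply(x, function(x) gsub("(\\d)\\*{1,3}(?=\\d)", "\\1", x, perl = TRUE))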
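One detail worth checking with capture-group variants of this idea: if the pattern consumes the digit on both sides of the gap, that digit is no longer available as the left-hand side of the next match, so a gap that follows a single digit can be missed. A minimal sketch, using a made-up test string that is not from the question:

```r
x <- "1*1*1"

# Both digits are consumed: the middle "1" is used up by the first match,
# so the second "*" has no digit left in front of it to match against.
consumed <- gsub("(\\d)\\*{1,3}(\\d)", "\\1\\2", x, perl = TRUE)

# The right-hand digit is only looked at, not consumed, so it can start
# the next match.
lookahead <- gsub("(\\d)\\*{1,3}(?=\\d)", "\\1", x, perl = TRUE)

print(consumed)   # "11*1"
print(lookahead)  # "111"
```

The lookbehind/lookahead patterns in the other answers avoid the problem for the same reason: they never consume the surrounding digits.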
Upvotes: 0
Reputation: 887148
Maybe we can build the pattern from n with sprintf, so the gap length is not hard-coded:
v1 <- c("11111111******111", "11111111111***2222222", "1111*****22222**111*****1111")
n <- 3
pat <- sprintf("(?<=[0-9])\\*{1,%d}(?=[0-9])", n)
gsub(pat, "", v1, perl = TRUE)
#[1] "11111111******111" "111111111112222222"
#[3] "1111*****22222111*****1111"
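If this is needed for several values of n, the same pattern could be wrapped in a small function. This is only a sketch: the name drop_short_gaps and its argument checks are assumptions for illustration, not something from the question.

```r
# Hypothetical helper: remove runs of 1..n asterisks that sit between two digits.
drop_short_gaps <- function(x, n) {
  stopifnot(is.character(x), is.numeric(n), length(n) == 1, n >= 1)
  # Lookbehind and lookahead require a digit on each side without consuming it.
  pat <- sprintf("(?<=[0-9])\\*{1,%d}(?=[0-9])", as.integer(n))
  gsub(pat, "", x, perl = TRUE)
}

v1 <- c("11111111******111",
        "11111111111***2222222",
        "1111*****22222**111*****1111")

res <- drop_short_gaps(v1, 3)

# Compare against the result expected in the question.
stopifnot(identical(res, c("11111111******111",
                           "111111111112222222",
                           "1111*****22222111*****1111")))
```

Because the pattern requires a digit on both sides, a short run of gaps at the very start or end of a string is left untouched.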
Upvotes: 1