jmarkov

Reputation: 193

Deleting multiple substrings of a string

I'm using R and I have a vector of strings made up of 1s and 2s.

Examples of strings could be the following:

  1. "11111111******111"
  2. "11111111111***2222222"
  3. "1111*****22222**111*****1111"

where "*" denotes a gap.

I'm interested in deleting every run of gaps whose length is at most a given number n.

Example with the sequences above:

With n = 3, the result should be:

  1. "11111111******111"
  2. "111111111112222222"
  3. "1111*****22222111*****1111"

In the second and third strings the "function" deleted a run of 3 gaps and a run of 2 gaps, respectively, because I want to delete every run of gaps of length 3 or less.

Upvotes: 0

Views: 69

Answers (3)

Shenglin Chen

Reputation: 4554

gsub('(?<=\\d)(\\*{1,3})(?=\\d)', '', v1, perl = TRUE)
#[1] "11111111******111"          "111111111112222222"         "1111*****22222111*****1111"
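
For readers less familiar with lookarounds, here is a minimal annotated sketch of the same pattern, split into its three parts. The names `pattern` and `res` and the inline copy of `v1` are only there so the snippet runs on its own; they are not part of the original answer.

```r
# Sample data from the question, repeated so the snippet is self-contained
v1 <- c("11111111******111",
        "11111111111***2222222",
        "1111*****22222**111*****1111")

pattern <- paste0(
  "(?<=\\d)",  # lookbehind: the gap must come right after a digit
  "\\*{1,3}",  # the gap itself: 1 to 3 literal asterisks
  "(?=\\d)"    # lookahead: the gap must be followed directly by a digit
)

# Lookarounds need the PCRE engine, hence perl = TRUE
res <- gsub(pattern, "", v1, perl = TRUE)
print(res)
```

A run of four or more asterisks never matches: whichever 1 to 3 of them the engine tries, an asterisk rather than a digit sits on at least one side.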

Upvotes: 0

emilliman5

Reputation: 5956

Similar to @akrun's answer:

x<- list("11111111******111",
"11111111111***2222222",
"1111*****22222**111*****1111")

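# Use an explicit {1,3}: older PCRE versions treat "{,3}" as literal text.
# The lookahead leaves the trailing digit unconsumed, so back-to-back gaps both match.
lapply(x, function(s) gsub("(\\d)\\*{1,3}(?=\\d)", "\\1", s, perl = TRUE))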
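
Before picking a threshold, it may help to see how long each gap actually is. The following is a small sketch using base R's `gregexpr()` and `regmatches()`; the names `gaps` and `gap_lengths` are illustrative, and the data is the question's sample repeated so the snippet runs alone.

```r
# Sample data from the question
v1 <- c("11111111******111",
        "11111111111***2222222",
        "1111*****22222**111*****1111")

# Extract every maximal run of asterisks from each string
gaps <- regmatches(v1, gregexpr("\\*+", v1))

# Length of each run: one integer vector per input string
gap_lengths <- lapply(gaps, nchar)
print(gap_lengths)
```

Any run whose length is at most n is one that the `gsub()` call would remove.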

Upvotes: 0

akrun

Reputation: 887148

Maybe we can build the pattern from n with sprintf():

n <- 3
pat <- sprintf("(?<=[0-9])\\*{1,%d}(?=[0-9])", n)
gsub(pat, "", v1, perl = TRUE)
#[1] "11111111******111"          "111111111112222222"
#[3] "1111*****22222111*****1111"

data

v1 <- c("11111111******111", "11111111111***2222222", "1111*****22222**111*****1111")
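
The pattern above only removes gaps that sit between two digits, so a short gap at the very start or end of a string is left in place. The question does not say whether such gaps can occur. If they can and should also be dropped, one possible variant (an assumption, not part of the original answers) is to require only that the run is not adjacent to another asterisk. `drop_short_gaps` is a hypothetical helper name.

```r
# Hypothetical helper: remove every run of 1..n asterisks, wherever it occurs
drop_short_gaps <- function(x, n) {
  stopifnot(n >= 1)
  # Negative lookarounds: the run may not touch another asterisk on either
  # side, so a match is always a complete gap of length <= n
  pat <- sprintf("(?<!\\*)\\*{1,%d}(?!\\*)", as.integer(n))
  gsub(pat, "", x, perl = TRUE)
}

# Made-up first string with leading and trailing gaps, plus one from the question
out <- drop_short_gaps(c("**111****22*", "1111*****22222**111*****1111"), 3)
print(out)
```

For strings that contain only digits and asterisks, this behaves the same as the digit-based lookarounds on interior gaps and differs only at the string boundaries.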

Upvotes: 1
