Saksham

Reputation: 9380

gsub: replace regex match with regex replacement string

I need to replace every run of more than two consecutive 1s with the same number of 0s. Currently I can find the matches as shown below, but I don't know how to replace each match with exactly as many zeros as it has characters.

ind<-c(1,1,0,0,0,1,1,1,1,0,1,1,0,0,0,1,1,0,1,0,0,1,0,1,0,1,0,1,1,1,1,1,0,1,0,1,0,1,1,1,0)
gsub("([1])\\1\\1+","0",paste0(ind,collapse=""))

gives

"11000001100011010010101000101000"   

as it replaces the match with just one 0 but I need

"11000000001100011010010101000000010100000"

Upvotes: 2

Views: 295

Answers (2)

Wiktor Stribiżew

Reputation: 626689

You can use the following gsub replacement:

ind<-c(1,1,0,0,0,1,1,1,1,0,1,1,0,0,0,1,1,0,1,0,0,1,0,1,0,1,0,1,1,1,1,1,0,1,0,1,0,1,1,1,0)
gsub("1(?=1{2,})|(?!^)\\G1","0",paste(ind,collapse=""), perl=TRUE)

See the IDEONE demo; the result is [1] "11000000001100011010010101000000010100000".

The regex needs perl=TRUE because it uses a lookahead and the \G anchor, neither of which R's default (TRE) regex engine supports.

This regex matches:

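  • 1 - a literal 1, but only if...
  • (?=1{2,}) - ...it is followed by 2 or more 1s (a lookahead, so those 1s are checked but not consumed), or...
  • (?!^)\\G1 - any 1 that starts exactly where the previous match ended; the (?!^) keeps \G from matching at the very start of the string.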
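
To see why both alternatives are needed, here is a small sketch on a made-up input ("01111" is an illustrative string, not from the question). With the lookahead alone, the last two 1s of a run survive, because they are no longer followed by two more 1s; the \G branch picks those up.

```r
s <- "01111"  # illustrative input: one run of four 1s

# Lookahead alone: only 1s that still have two 1s after them are replaced.
lookahead_only <- gsub("1(?=1{2,})", "0", s, perl = TRUE)

# Full pattern: \G continues from the end of the previous match,
# so the trailing 1s of the run are replaced as well.
full_pattern <- gsub("1(?=1{2,})|(?!^)\\G1", "0", s, perl = TRUE)

print(lookahead_only)  # "00011"
print(full_pattern)    # "00000"
```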

For more details on the \G operator, see What good is \G in a regular expression? at perldoc.perl.org, and When is \G useful application in a regex? SO post.
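
If the \G construct feels too opaque, a different technique (not the regex above) is to locate each run with gregexpr and assign same-length replacements through regmatches<-. This is only a sketch; the helper name zero_long_runs is hypothetical, and strrep assumes R 3.3.0 or later.

```r
# Hypothetical helper: zero out every run of three or more 1s in a 0/1 string.
zero_long_runs <- function(s) {
  m <- gregexpr("1{3,}", s)  # positions and lengths of each run of 3+ ones
  regmatches(s, m) <- lapply(regmatches(s, m), function(runs) {
    strrep("0", nchar(runs)) # one string of zeros per run, same length as the run
  })
  s
}

ind <- c(1,1,0,0,0,1,1,1,1,0,1,1,0,0,0,1,1,0,1,0,0,1,0,1,0,1,0,1,1,1,1,1,0,1,0,1,0,1,1,1,0)
res <- zero_long_runs(paste(ind, collapse = ""))
print(res)  # "11000000001100011010010101000000010100000"
```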

Upvotes: 2

Colonel Beauvel

Reputation: 31161

A solution not using regex but rle:

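x = rle(ind)                            # run-length encode: value and length of each run
x$values[x$lengths>2 & x$values] <- 0   # zero out runs of 1s longer than 2
inverse.rle(x)                          # expand the runs back into a full vector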

#[1] 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0
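
The same rle idea can be wrapped in a function and pasted back into a string to compare with the output asked for in the question. A minimal sketch: the function name zero_runs_rle and the min_run parameter are assumptions for illustration, and the comparison with 1 is written out explicitly instead of relying on numeric-to-logical coercion.

```r
# Hypothetical helper: set runs of 1s of length >= min_run to 0.
zero_runs_rle <- function(v, min_run = 3) {
  r <- rle(v)                                          # run values and lengths
  r$values[r$lengths >= min_run & r$values == 1] <- 0  # flip only the long runs of 1s
  inverse.rle(r)                                       # rebuild the full-length vector
}

ind <- c(1,1,0,0,0,1,1,1,1,0,1,1,0,0,0,1,1,0,1,0,0,1,0,1,0,1,0,1,1,1,1,1,0,1,0,1,0,1,1,1,0)
out <- zero_runs_rle(ind)
out_string <- paste(out, collapse = "")
print(out_string)  # "11000000001100011010010101000000010100000"
```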

Upvotes: 1
