Andri Signorell

Reputation: 1309

Unexpected replacement in R gsub regular expression

I want to replace y in the following string, but not if it's in combination with %. The following regex works fine for finding the pattern:

gsub(pattern = "([^%]y)", replacement = "*", "%x%xxxx_y_%y%y")
# [1] "%x%xxxx*_%y%y"

but it replaces two characters (_y) instead of just one (y), as I would have expected. What's wrong?

Any help appreciated! Andri

Upvotes: 0

Views: 317

Answers (3)

akrun

Reputation: 887098

You could try the regex lookbehind

 gsub("(?<=[^%])y", "", "%x%xxxx_y_%y%y", perl=TRUE)
#[1] "%x%xxxx__%y%y"

It can be viewed at regex101

(?<=[^%])y

Regular expression visualization
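
As a small sketch of the same lookbehind idea with the "*" replacement from the question: the lookbehind checks the preceding character without consuming it, so only the y is replaced. The second input, `edge`, is a made-up string (not from the question) to show that a positive lookbehind needs some preceding character, while a negative lookbehind also accepts a y at the very start of the string.

```r
x <- "%x%xxxx_y_%y%y"

# Positive lookbehind: the preceding character is checked but not consumed,
# so only the "y" itself is replaced.
star_positive <- gsub("(?<=[^%])y", "*", x, perl = TRUE)
star_positive
# [1] "%x%xxxx_*_%y%y"

# Hypothetical input with a "y" at the start of the string.
edge <- "y_%y_y"

# "(?<=[^%])" requires a preceding character, so the leading "y" is kept.
edge_positive <- gsub("(?<=[^%])y", "*", edge, perl = TRUE)
edge_positive
# [1] "y_%y_*"

# "(?<!%)" only requires that no "%" precedes, so the leading "y" is replaced.
edge_negative <- gsub("(?<!%)y", "*", edge, perl = TRUE)
edge_negative
# [1] "*_%y_*"
```

Which of the two variants is wanted depends on whether a y at the start of the string should count as "not in combination with %".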

Upvotes: 1

G. Grothendieck

Reputation: 269556

1) Change the parentheses in the regular expression as shown and make the corresponding change to the replacement string as follows:

gsub("([^%])y", "\\1", "%x%xxxx_y_%y%y")
## [1] "%x%xxxx__%y%y"

Here is a visualization of the regular expression:

([^%])y

Regular expression visualization

Debuggex Demo
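
To get the "*" the question asked for, the backreference can be followed by the replacement character. The sketch below also shows one difference from a lookbehind: the capturing-group version consumes the preceding character, so adjacent matches cannot overlap. The input "ayy" is a made-up example, not from the question.

```r
x <- "%x%xxxx_y_%y%y"

# Put the captured character back and append "*" in place of the "y".
with_star <- gsub("([^%])y", "\\1*", x)
with_star
# [1] "%x%xxxx_*_%y%y"

# Hypothetical input with two consecutive "y"s.
# The first match "ay" consumes the "a" and the first "y"; the second "y"
# has no unconsumed character before it, so it is left alone.
consumed <- gsub("([^%])y", "\\1*", "ayy")
consumed
# [1] "a*y"

# A lookbehind consumes nothing, so both "y"s are replaced.
not_consumed <- gsub("(?<=[^%])y", "*", "ayy", perl = TRUE)
not_consumed
# [1] "a**"
```

For the string in the question both variants give the same result; the difference only shows up with consecutive matches.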

2) It could be done with exactly your regular expression using gsubfn:

library(gsubfn)
gsubfn("([^%]y)", ~ substr(x, 1, 1), "%x%xxxx_y_%y%y")
## [1] "%x%xxxx__%y%y"

Here is a visualization of the regular expression:

([^%]y)

Regular expression visualization

Debuggex Demo

Update: Added visualizations.

Upvotes: 3

Avinash Raj

Reputation: 174706

For this case, you could use a positive lookbehind, a capturing group, or \K (which drops the characters matched so far from the final match).

> gsub("[^%]\\Ky", "*", "%x%xxxx_y_%y%y", perl=TRUE)
[1] "%x%xxxx_*_%y%y"

\K keeps the text matched so far out of the overall regex match.
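
One way to see this, as a rough sketch, is to extract what each pattern actually matches with regmatches() and gregexpr(). The original pattern from the question matches two characters, which is why gsub replaced both; with \K only the y remains in the reported match.

```r
x <- "%x%xxxx_y_%y%y"

# Pattern from the question: the match includes the character before "y".
m_orig <- regmatches(x, gregexpr("[^%]y", x))[[1]]
m_orig
# [1] "_y"

# With \K (PCRE only, hence perl = TRUE) the reported match starts after
# the preceding character, so only "y" is matched and replaced.
m_k <- regmatches(x, gregexpr("[^%]\\Ky", x, perl = TRUE))[[1]]
m_k
# [1] "y"
```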

DEMO

Upvotes: 0
