Reputation: 6778
Let's say I have the string x <- "AbC" and I want to put an ampersand between each pair of adjacent letters. I assumed I could just do gsub("([a-zA-Z])([a-zA-Z])", "\\1 & \\2", x), but that produces "A & bC". Why doesn't gsub recognize the second pair of letters that matches the regex? It's not as if gsub only replaces the first match found: if I have x <- "AbC DE" and run the same command, I get "A & bC D & E".
What am I missing about how gsub does its replacement? I expected "A & b & C" and "A & b & C D & E" from the two inputs above.
Upvotes: 2
Views: 805
Reputation: 174706
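Because once a character has been consumed by one match, the regex engine won't match that character again; that is, gsub doesn't find overlapping matches. In "AbC", the first match consumes "Ab", which leaves only "C", and a single letter can't match a two-letter pattern. Use a lookahead to get around this: it checks that a letter follows without consuming it.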
gsub("([a-zA-Z])(?=[a-zA-Z])", "\\1 & ", x, perl = TRUE)
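To see the difference, it can help to list what each pattern actually matches. The following is a minimal sketch using base R's gregexpr and regmatches; the variable names (pairs, singles, result) are just illustrative and not part of the original question.

```r
x <- c("AbC", "AbC DE")

# Two-letter pattern: each match consumes both letters, so the scan
# resumes after them and the leftover "C" has no partner to match with.
pairs <- regmatches(x, gregexpr("[a-zA-Z][a-zA-Z]", x))
# pairs[[1]] is "Ab"; pairs[[2]] is "Ab" "DE"

# Lookahead pattern: only the first letter is consumed; the following
# letter is checked but left available for the next match.
singles <- regmatches(x, gregexpr("[a-zA-Z](?=[a-zA-Z])", x, perl = TRUE))
# singles[[1]] is "A" "b"; singles[[2]] is "A" "b" "D"

# Each matched letter gets " & " appended after it.
result <- gsub("([a-zA-Z])(?=[a-zA-Z])", "\\1 & ", x, perl = TRUE)
print(result)
# [1] "A & b & C"       "A & b & C D & E"
```

Note that perl = TRUE is needed here, since lookaround is not supported by R's default (TRE) regex engine.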
Upvotes: 10