Reputation: 1651
I do not understand why I am required to use two backslashes to prevent a reversal of my backreference. Below, I detail how I discovered my problem:
I wanted to transform a character string that looks like this:
x <- "53/100 000"
into one that looks like this:
53/100000
Here are a few ideas I had before I came to ask this question:
I thought that I could use the function gsub to remove all spaces that occur after the / character. However, I thought that a regex solution might be more elegant/efficient.
At first, I didn't know how to backreference in regex, so I tried this:
> gsub("/.+\\s",".+",x)
[1] "53.+000"
Then I read on this website that you can backreference captured patterns using \1. So I began to use this:
> gsub("/.+\\s","\1",x)
[1] "53\001000"
Then I realized that the backreference only considers the wildcard match. But I wanted to keep the / character. So I added it back in:
> gsub("/.+\\s","/\1",x)
[1] "53/\001000"
I then tried a bunch of other things, but I fixed it by adding an extra backslash and enclosing my wildcard in parentheses:
> gsub("/(.+)\\s","/\\1",x)
[1] "53/100000"
Moreover, I was able to remove the / character from my replacement by moving the left parenthesis to the beginning of the pattern:
> gsub("(/.+)\\s","\\1",x)
[1] "53/100000"
Hm, so it seemed two things were required: parentheses and an extra backslash. I think I understand the parentheses: they mark the part of the text that the backreference refers to.
What I do not understand is why two backslashes are required. According to the reference website, only \1 is required. What's going on here? Why is my backreference being reversed?
Upvotes: 1
Views: 237
Reputation: 585
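The extra backslash is required because R processes escape sequences in a string literal before the string ever reaches gsub. In "\1", R reads the backslash and digit as a single escape sequence (the character with code 1, which prints as "\001"), so gsub never sees a backslash at all. In "\\1", the doubled backslash is an escaped literal backslash, so the string that gsub receives is the two characters \1, which it interprets as a backreference to the first parenthesized group.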
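To make the difference visible, here is a small sketch that inspects the two strings before they are handed to gsub. It reuses the x and the pattern from the question. The variable names result and result_raw are just for illustration, and the raw-string form at the end assumes R 4.0.0 or later.

```r
x <- "53/100 000"

# "\1" is parsed by R as ONE character: the control character with code 1.
nchar("\1")        # 1
utf8ToInt("\1")    # 1

# "\\1" is TWO characters: a literal backslash followed by the digit 1.
nchar("\\1")       # 2
cat("\\1\n")       # prints \1, which is what gsub actually receives

# Only the two-character form is seen by gsub as a backreference.
result <- gsub("(/.+)\\s", "\\1", x)
print(result)      # "53/100000"

# Assumption: R >= 4.0.0. Raw strings skip escape processing,
# so the backslashes do not need to be doubled.
result_raw <- gsub(r"((/.+)\s)", r"(\1)", x)
print(result_raw)
```

The same reasoning explains the \\s in the pattern: the R parser turns \\s into \s, and the regex engine then reads \s as "whitespace".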
Upvotes: 3