showkey

Reputation: 320

What does "\ " mean in an R regular expression?

"hallo\ nworld"=="hallo nworld"
[1] TRUE

In R, does a backslash followed by a blank equal a plain blank? What does this kind of syntax mean?

Upvotes: 1

Views: 2447

Answers (2)

poiu2000

Reputation: 980

In R regular expressions, some metacharacters (. \ | ( ) [ { ^ $ * + ?) have special meanings. For example, . matches any single character, and + means the preceding item is matched one or more times.

> grep("a+", c("abc.", "def", "cba a", "a.a", "a+"), value=TRUE)
[1] "abc."  "cba a" "a.a"   "a+"

In this example, a+ matches any string that contains one or more a characters.

If you want to match strings that contain an actual + character (like the last string, a+, in the example above), put a backslash \ before it in the regex so that the parser treats it literally instead of giving it its special meaning. The example then becomes:

> grep("a\\+", c("abc.", "def", "cba a", "a.a", "a+"), value=TRUE)
[1] "a+"

Note the \\ before +: the backslash is itself a metacharacter and is also the escape character in R string literals, so you need one \ to escape the + and a second \ to escape that backslash.
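To make the double backslash less mysterious, here is a small sketch (the variable names are only for illustration) that checks how many characters the pattern really holds and compares the escaped regex with a fixed-string search:

```r
# The source code contains four characters between the quotes,
# but the resulting string holds only three: a, \ and +.
pattern <- "a\\+"
n_chars <- nchar(pattern)

# writeLines(pattern) shows the string as the regex engine receives it.
writeLines(pattern)

x <- c("abc.", "def", "cba a", "a.a", "a+")

# Escaped regex: matches only strings containing a literal "a+".
escaped_hits <- grep(pattern, x, value = TRUE)

# Alternative: fixed = TRUE turns off regex interpretation entirely,
# so no escaping is needed.
fixed_hits <- grep("a+", x, value = TRUE, fixed = TRUE)
```

Both searches should pick out the same element, the string "a+".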

When you put a backslash before a non-metacharacter, its meaning is implementation-dependent: for example, \a is interpreted as BEL, \t as TAB, and \r as CR. In your case you preceded a space with \, and it is still interpreted as a space.

However, \+ is not defined as an escape sequence in a string literal, as this test shows:
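As a rough illustration of those escapes, the following sketch uses base R's utf8ToInt() to look at the character codes that end up in the string (the variable names are made up for this example):

```r
# Each of these escapes produces exactly one character.
tab_len  <- nchar("\t")

# utf8ToInt() returns the code point of a single character.
bel_code <- utf8ToInt("\a")   # BEL
tab_code <- utf8ToInt("\t")   # TAB
cr_code  <- utf8ToInt("\r")   # CR

# Doubling the backslash keeps it as a real character,
# so "\\t" is two characters: a backslash followed by t.
literal_len <- nchar("\\t")
```

The codes come out as 7, 9 and 13, the usual ASCII values for BEL, TAB and CR.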

> str <- "hello,\+world"
Error: '\+' is an unrecognized escape in character string starting ""hello,\+"

So to use + in a literal string, write + directly. In a regex, write + on its own as the repetition quantifier, or use the escape sequence \\+ to match a literal plus sign.

I found two references useful, "Regular Expressions as used in R" and "Regular Expression with The R Language"; you can find more details there.
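For completeness, here is a sketch of three common ways to match a literal + with grepl(); the sample vector is invented for the example:

```r
x <- c("1+1", "11", "a+")

# 1. Escape the plus sign: "\\+" reaches the regex engine as \+
by_escape <- grepl("\\+", x)

# 2. Put it in a bracket expression, where + has no special meaning
by_class <- grepl("[+]", x)

# 3. Skip regex interpretation altogether with fixed = TRUE
by_fixed <- grepl("+", x, fixed = TRUE)
```

All three should agree on which elements contain a plus sign. Which one to prefer is mostly a matter of readability; fixed = TRUE is the simplest when the whole pattern is meant literally.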

Upvotes: 2

Max Candocia

Reputation: 4385

\ is an escape character. It changes the meaning of the following character, although in the case of a space it doesn't change anything. If you write '\t' you get a tab character, and if you write '\n' you get a newline. The backslash only has a special effect before certain characters; before others, such as the space here, the character is simply taken as-is. If you want to include a '\' in your output, you need to write \\.

Here are some other uses of the backslash character in regular expressions:

http://www.gnu.org/software/emacs/manual/html_node/emacs/Regexp-Backslash.html

Upvotes: 1
