chan chong

Reputation: 145

Forming and using regular expressions in R

I am new to R and am learning how to form regular expressions.

For example, I have seen something like "(\\2.\\3)". What are these? What do the numbers and the notation represent? Can anyone explain in layman's terms what this means? Or something like (\2.\4)(\2.\4): what does it mean? Thanks for any help!

Upvotes: 6

Views: 1356

Answers (1)

hwnd

Reputation: 70732

They are called backreferences, and they recall what was matched by a capturing group. A capturing group is created by placing the characters to be grouped inside a set of parentheses ( ). A backreference is written as a backslash (\) followed by a digit indicating the number of the group to be recalled. In an R string literal the backslash must itself be escaped, so it is written as two backslashes (\\).

The example below uses backreferences in the replacement to recall what was matched by capturing groups #2 and #3:

x <- 'foo bar baz quz'
sub('(\\S+) (\\S+) (\\S+) (\\S+)', '(\\2.\\3)', x)
# [1] "(bar.baz)"

Note: The opening and closing parentheses in the replacement, along with the dot, are literal characters.
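
To connect this back to the second pattern in the question, here is a minimal sketch that reuses the same input string and pattern with the replacement (\2.\4)(\2.\4). The variable names (pattern, res_two, res_swap, res_dup) and the 'hello hello world' input are illustrative assumptions, not part of the original answer.

```r
x <- 'foo bar baz quz'
pattern <- '(\\S+) (\\S+) (\\S+) (\\S+)'

# In R source each backslash is doubled; the regex engine sees a single one
cat('(\\2.\\4)(\\2.\\4)\n')
# (\2.\4)(\2.\4)

# \\2 and \\4 recall groups 2 and 4; the parentheses and dots are literal
res_two <- sub(pattern, '(\\2.\\4)(\\2.\\4)', x)
res_two
# [1] "(bar.quz)(bar.quz)"

# Groups can be recalled in any order, and not all of them have to be used
res_swap <- sub(pattern, '\\4 \\1', x)
res_swap
# [1] "quz foo"

# A backreference can also appear inside the pattern itself, where it
# matches a repeat of whatever the group captured
res_dup <- sub('(\\S+) \\1', '\\1', 'hello hello world')
res_dup
# [1] "hello world"
```

Because the pattern matches the whole string, sub() replaces all of it, so only the recalled groups and the literal characters of the replacement remain in the result.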

Upvotes: 6
