user3030872

Reputation: 477

Gsub regex replacement

I am trying to do a gsub replacement in R. I want to find pairs of terms, one from each of two lists, that are separated by whitespace, and replace that whitespace with an underscore. My pattern matches correctly, but I am not experienced enough with regex to work out from the gsub documentation how to write the replacement. Can somebody help me write the gsub call?

Right now I have:

gsub("(a|b|c)\\s+(x|y|z)", "(a|b|c)_(x|y|z)", "a x")

(Note: there are several places in the string that match this if that matters)

I want to go from:
a x -> a_x
b z -> b_z
hello world b x how are a z you -> hello world b_x how are a_z you... and so on.

Instead it does:
a x -> (a|b|c)_(x|y|z)
b z -> (a|b|c)_(x|y|z) ... and so on.

If anyone wants to add a little theory, that would be appreciated, but I'm working on a deadline, so a working answer alongside it would be ideal.

Thanks.

Upvotes: 4

Views: 3298

Answers (1)

Sven Hohenstein

Reputation: 81683

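The replacement argument of gsub is a plain string, not a regex, so "(a|b|c)_(x|y|z)" is inserted literally. To reuse what was matched, use the backreferences \\1 and \\2 in the replacement: they stand for the text captured by the first and second pair of parentheses in the pattern.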

vec <- "hello world b x how are a z you"

gsub("(a|b|c)\\s+(x|y|z)","\\1_\\2", vec)
# [1] "hello world b_x how are a_z you"
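
The same approach can be sketched for the case where the two term lists live in vectors. This is only an illustration under some assumptions: the names `first` and `second` are made up for the example, and the `\\b` word boundaries (with `perl = TRUE`) are an addition that assumes the terms should match only as whole words, which the question does not state.

```r
# Hypothetical term lists; `first` and `second` are illustrative names.
first  <- c("a", "b", "c")
second <- c("x", "y", "z")

# Builds "\\b(a|b|c)\\s+(x|y|z)\\b".
# The \\b anchors are an assumption: match the terms only as whole words.
pattern <- paste0("\\b(", paste(first, collapse = "|"), ")\\s+(",
                  paste(second, collapse = "|"), ")\\b")

vec <- c("a x", "b z", "hello world b x how are a z you", "banana xylophone")

# \\1 and \\2 re-insert whatever the first and second groups matched.
result <- gsub(pattern, "\\1_\\2", vec, perl = TRUE)
print(result)
# [1] "a_x"                             "b_z"
# [3] "hello world b_x how are a_z you" "banana xylophone"
```

Without the word boundaries, the last element would become "banana_xylophone", because "a x" also occurs across the two words. If exactly one whitespace character must separate the terms, as the question describes, `\\s` could be used in place of `\\s+`.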

Upvotes: 4
