Reputation: 153
I'm finding it a bit difficult to write a regular expression that converts a string of the type:
[1] "[hola;adios] address1;[hola;adios] address2"
into:
[1] "[hola|adios] address1;[hola|adios] address2"
that is, replacing the semicolons inside the brackets with vertical bars. The attempts I've made either fail to restrict the replacement to the semicolons inside the brackets (the ones outside are also replaced), or they replace the entire substring [hola;adios] with a vertical bar.
I'd be very grateful if someone could give me some pointers on how to accomplish this task in R.
Upvotes: 2
Views: 141
Reputation: 70722
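Using the gsubfn package, you can avoid lookarounds: it matches each bracketed chunk and passes it to the replacement formula, which swaps the semicolons within that chunk only.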
x <- '[hola;adios] address1;[hola;adios] address2'
gsubfn('\\[[^]]*]', ~ gsub(';', '|', x), x)
# [1] "[hola|adios] address1;[hola|adios] address2"
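If installing a package is not an option, the same match-then-replace idea can be sketched in base R with gregexpr and regmatches<-. This is only a minimal sketch, not part of the original answer; the variable names m and y are made up for illustration.

```r
x <- '[hola;adios] address1;[hola;adios] address2'
y <- x  # work on a copy so x stays untouched

# Locate every bracketed chunk such as "[hola;adios]"
m <- gregexpr('\\[[^]]*\\]', y)

# Replace the semicolons inside each chunk only, then write the chunks back
regmatches(y, m) <- lapply(regmatches(y, m),
                           function(s) gsub(';', '|', s, fixed = TRUE))
y
# [1] "[hola|adios] address1;[hola|adios] address2"
```

fixed = TRUE treats ';' and '|' as literal strings, so no regex escaping is needed in the inner call.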
Upvotes: 1
Reputation: 174696
You could try the gsub commands below.
> x <- '[hola;adios] address1;[hola;adios] address2'
> gsub(";(?=[^\\[\\]]*\\])", "|", x, perl=T)
[1] "[hola|adios] address1;[hola|adios] address2"
;(?=[^\\[\\]]*\\]) matches a semicolon only if it is followed by [^\[\]]* (any character other than [ or ], zero or more times) and then \] (a closing square bracket). So this matches all the semicolons that are inside the square brackets. (?=...) is called a positive lookahead assertion.
OR
> gsub(";(?![^\\[\\]]*\\[)", "|", x, perl=T)
[1] "[hola|adios] address1;[hola|adios] address2"
(?!...) is called a negative lookahead, which does the opposite of a positive lookahead assertion: the semicolon matches only if it is not followed by the given pattern.
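The two patterns agree on the string in the question, but they are not interchangeable in general. The sketch below uses a made-up input (y is hypothetical, not from the question) whose last semicolon sits outside any brackets, to show where they differ.

```r
# Hypothetical input: the final semicolon is outside the brackets
y <- '[a;b] c;[d;e] f;g'

# Positive lookahead: replace only when a closing ] follows before any bracket
pos <- gsub(';(?=[^\\[\\]]*\\])', '|', y, perl = TRUE)
pos
# [1] "[a|b] c;[d|e] f;g"

# Negative lookahead: replace unless an opening [ follows before any bracket
neg <- gsub(';(?![^\\[\\]]*\\[)', '|', y, perl = TRUE)
neg
# [1] "[a|b] c;[d|e] f|g"
```

The negative-lookahead version also replaces the trailing semicolon, because no [ follows it. If semicolons can appear after the last bracketed group, the positive-lookahead version is probably the safer choice.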
Upvotes: 3