Reputation: 1306
I'm editing some strings in R, and I would like to delete everything that is in parentheses from a string. The problem is that I'm not very savvy with regular expressions, and whenever I try to use gsub on parentheses, it either doesn't work or doesn't yield the correct result.
Any hints? I have a feeling it's a solvable problem. Might there be a function I can use that isn't gsub?
Example: the string "abc def (foo) abc (def)" should be stripped to "abc def abc".
If the only way to do this is to specify what's in the parentheses, that would be fine as well.
Upvotes: 1
Views: 544
Reputation: 109844
The bracketX function in the qdap package was designed for this problem:
library(qdap)
x <- "abc def (foo) abc (def)"
bracketX(x, "round")
## > bracketX(x, "round")
## [1] "abc def abc"
Upvotes: 2
Reputation: 118779
Just another way:
x <- "abc def (foo) abc (def)"
gsub(" *\\(.*?)", "", x)
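You need to escape the ( with a \ in regular expressions. In an R string the backslash itself has to be escaped, so you write \\(. Then you match anything (.*) after the ( in a non-greedy manner, with a ? after the .*, followed by ). With R's default regex engine the closing ) does not have to be escaped, although writing \\) is also valid and is required with perl = TRUE.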
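To see why the non-greedy ? matters, here is a small sketch comparing the greedy and non-greedy forms on the example string. The variable names greedy and lazy are only illustrative, and the closing parenthesis is escaped here so the patterns also work with perl = TRUE.

```r
x <- "abc def (foo) abc (def)"

# Greedy: .* runs on to the LAST ")" in the string, so everything
# from the first "(" onwards is removed in a single match.
greedy <- gsub(" *\\(.*\\)", "", x)

# Non-greedy: .*? stops at the FIRST ")" after each "(",
# so each parenthesised group is removed separately.
lazy <- gsub(" *\\(.*?\\)", "", x)

print(greedy)  # "abc def"
print(lazy)    # "abc def abc"
```

The greedy version swallows the "abc" between the two groups, which is usually not what you want.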
Upvotes: 3
Reputation: 25444
Parentheses are special characters in regular expressions, including those used by R, so you have to escape them with a backslash \. The trouble is that the backslash itself needs to be escaped in R strings with a second backslash, which leads to the following rather clumsy construction:
gsub(" *\\([^)]*\\) *", " ", "abc def (foo) abc (def)")
Careful with spaces: they are not handled correctly by my gsub call. For this input the result ends with a trailing space ("abc def abc ").
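One possible way around the spacing issue is to remove the parenthesised groups first and tidy the whitespace afterwards. This is only a sketch: strip_parens is a hypothetical helper name, it assumes the parentheses are not nested, and trimws needs R 3.2.0 or later.

```r
# Hypothetical helper: drop "(...)" groups, then normalise spaces.
# Assumes parentheses are not nested.
strip_parens <- function(s) {
  s <- gsub("\\([^)]*\\)", "", s)  # remove each "(...)" group
  s <- gsub(" +", " ", s)          # collapse runs of spaces into one
  trimws(s)                        # trim leading and trailing whitespace
}

res1 <- strip_parens("abc def (foo) abc (def)")
res2 <- strip_parens("(foo) abc")

print(res1)  # "abc def abc"
print(res2)  # "abc"
```

Splitting the work into three steps keeps each pattern simple, at the cost of also collapsing any intentional double spaces in the input.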
Upvotes: 2