riders994

Reputation: 1306

String Editing in R - Trouble with Parentheses

I'm editing some strings in R and would like to delete everything that is in parentheses, along with the parentheses themselves. The problem is that I'm not very savvy with regular expressions, and whenever I try to use gsub on parentheses, it either doesn't work or doesn't yield the correct result.

Any hints? I have a feeling it's a solvable problem. Is there a function other than gsub that I could use?

Example: the string "abc def (foo) abc (def)" should be stripped to "abc def abc".

If the only way to do this is to specify what's in the parentheses, that would be fine as well.

Upvotes: 1

Views: 544

Answers (3)

Tyler Rinker

Reputation: 109844

The bracketX function in the qdap package was designed for this problem:

library(qdap)
x <- "abc def (foo) abc (def)"
bracketX(x, "round")

## > bracketX(x, "round")
## [1] "abc def abc"

Upvotes: 2

Arun

Reputation: 118779

Just another way:

x <- "abc def (foo) abc (def)"
gsub(" *\\(.*?)", "", x)

You need to escape the ( with a \ in regular expressions. Because R string literals also treat the backslash as an escape character, you have to write it twice: \\. After the (, you match anything (.*) in a non-greedy manner by putting a ? after the .*, followed by ), which you don't have to escape here (although writing \\) works too).
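To make the role of the ? concrete, here is a minimal sketch comparing the non-greedy and greedy forms on the example string. It escapes both parentheses, which is a small variation on the pattern above, and the variable names stripped and greedy are only illustrative.

```r
x <- "abc def (foo) abc (def)"

# Non-greedy: .*? stops at the first ")" it can, so each group is removed separately.
stripped <- gsub(" *\\(.*?\\)", "", x)
print(stripped)  # "abc def abc"

# Greedy: .* runs from the first "(" to the last ")", removing the text in between too.
greedy <- gsub(" *\\(.*\\)", "", x)
print(greedy)    # "abc def"
```

Without the ?, the word "abc" between the two groups is lost, which is likely the kind of incorrect result the question describes.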

Upvotes: 3

krlmlr

Reputation: 25444

Parentheses are special characters in most regular expression dialects, including the ones used by R. You have to escape them with a backslash \. The trouble is that the backslash itself needs to be escaped in R strings, with a second backslash, which leads to the following rather clumsy construction:

gsub(" *\\([^)]*\\) *", " ", "abc def (foo) abc (def)")

Careful with spaces: my gsub call does not handle them correctly. For the example above it leaves a trailing space, giving "abc def abc ".
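One possible way to tidy up the leftover whitespace is to trim the result afterwards. The sketch below assumes base R's trimws (available since R 3.2.0); the helper name strip_parens is hypothetical and not part of any package.

```r
# Hypothetical helper: remove "(...)" groups, then trim the ends.
strip_parens <- function(s) {
  # Replace each group, plus any spaces around it, with a single space.
  tmp <- gsub(" *\\([^)]*\\) *", " ", s)
  # Drop the leading or trailing space left by a group at either end.
  trimws(tmp)
}

result <- strip_parens("abc def (foo) abc (def)")
print(result)  # "abc def abc"
```

Note that a group with no surrounding spaces, as in "a(b)c", still gets replaced by a single space, so this is only appropriate when the groups separate words.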

Upvotes: 2
