David Arenburg

Reputation: 92292

How to remove everything after either of two different characters

Consider

temp <- c("12/30 - 1/5", "4/21-4/27")
##[1] "12/30 - 1/5" "4/21-4/27"

I need

##[1] "12/30"     "4/21" 

I know how to produce each one of them separately:

gsub(" .*", "", temp)
##[1] "12/30"     "4/21-4/27"

gsub("-.*", "", temp)
##[1] "12/30 " "4/21" 

How can I combine them into one expression?

Upvotes: -1

Views: 62

Answers (2)

Tim Pietzcker

Reputation: 336158

That's what character classes are there for:

> gsub("[ -].*", "", temp)
[1] "12/30" "4/21"

One caveat: inside a character class, a dash between two characters denotes a range, as in [0-9], which matches any digit from 0 to 9. It is only taken literally when it is the first or last character of the class, so to match only 0, 9, or a literal - you would have to write [09-]. In the current regex that's not an issue, because the dash is already the last of the two characters in the class. But when you start expanding the class (adding new characters), make sure you keep the dash at the end.
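To make the caveat concrete, here is a small sketch of what happens when the class is expanded. The third input string and the comma delimiter are made up for illustration and are not part of the question; perl = TRUE is used in the last call only so that the range is interpreted by code point regardless of locale.

```r
# Hypothetical input: the question's two strings plus a comma-separated one.
temp <- c("12/30 - 1/5", "4/21-4/27", "6/1, 6/7")

# Dash last in the class: it is a literal dash.
dash_last <- gsub("[ ,-].*", "", temp)
print(dash_last)
## [1] "12/30" "4/21"  "6/1"

# Dash first in the class: also a literal dash.
dash_first <- gsub("[- ,].*", "", temp)
print(dash_first)
## [1] "12/30" "4/21"  "6/1"

# Dash in the middle: "[ -,]" is now a range from space to comma,
# which does not contain the dash itself, so "4/21-4/27" is left untouched.
dash_range <- gsub("[ -,].*", "", temp, perl = TRUE)
print(dash_range)
## [1] "12/30"     "4/21-4/27" "6/1"
```

The last call fails silently: no error is raised, the pattern just stops matching the dash.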

Upvotes: 5

Stephan Kolassa

Reputation: 8267

You could probably OR the regexps together, but I personally find that hard to read. It is easier to just apply one gsub after the other:

> gsub("\\-.*", "", gsub("\\ .*", "", temp))
[1] "12/30" "4/21"
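For comparison, ORing the two patterns from the question together might look like the sketch below. It is just one way to write the alternation, not necessarily what either answerer had in mind; the variable names are illustrative.

```r
temp <- c("12/30 - 1/5", "4/21-4/27")

# Alternation: remove from the first space, or from the first dash, to the end.
or_result <- gsub(" .*|-.*", "", temp)
print(or_result)
## [1] "12/30" "4/21"

# Same idea with the alternation grouped in front of a shared ".*".
grouped_result <- sub("( |-).*", "", temp)
print(grouped_result)
## [1] "12/30" "4/21"

# Both agree with applying the two substitutions one after the other.
nested_result <- gsub("-.*", "", gsub(" .*", "", temp))
```

Since each pattern consumes the rest of the string, sub() is enough here; gsub() gives the same result.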

Upvotes: 0
