RayVelcoro

Reputation: 534

Removing multiple words from a string using a vector instead of regexp in R

I would like to remove multiple words from a string in R, but would like to use a character vector instead of a regexp.

For example, if I had the string

"hello how are you" 

and wanted to remove

c("hello", "how")

I would return

" are you"

I can get close with str_remove() from stringr, but it is vectorised over the patterns and returns one result per word:

"hello how are you" %>% str_remove(c("hello","how"))
[1]  "how are you"   "hello  are you"

But I'd need to do something more to collapse this into a single string. Is there a function that does all of this in one call?

Upvotes: 4

Views: 4705

Answers (2)

tmfmnk

Reputation: 40171

A base R possibility could be:

x <- "hello how are you"   
trimws(gsub("hello|how", "", x))  # replace matches with the empty string; "\\1" would refer to a capture group that does not exist

[1] "are you"

Or if you have more words, a clever idea proposed by @Wimpel:

words <- paste(c("hello", "how"), collapse = "|")
trimws(gsub(words, "", x))  # replace matches with the empty string; the pattern has no capture group for "\\1"
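
If the goal is to avoid regular expressions altogether, one option is to remove each word as a literal string with fixed = TRUE, applying the removals one after another. This is only a minimal sketch: remove_words is a hypothetical helper name, not a base R function, and it matches substrings rather than whole words.

```r
# Hypothetical helper: strip each element of `words` from `x` literally.
# fixed = TRUE turns off regex interpretation, so characters such as "."
# or "(" in `words` are matched as-is.
remove_words <- function(x, words) {
  Reduce(function(acc, w) gsub(w, "", acc, fixed = TRUE), words, x)
}

x <- "hello how are you"
result <- trimws(remove_words(x, c("hello", "how")))
print(result)
```

Reduce() feeds the result of each gsub() call into the next one, so the output stays a single string however many words are supplied.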

Upvotes: 2

akrun

Reputation: 887901

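We can collapse the words with | so that the pattern is evaluated as a regex OR, and then remove every match with str_remove_all() (words is defined under data below)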

library(stringr)
library(magrittr)
pat <- str_c(words, collapse="|")
"hello how are you" %>% 
      str_remove_all(pat) %>%
      trimws
#[1] "are you"

data

words <- c("hello", "how")
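
One caveat with a plain OR pattern is that it also matches inside longer words, for example "how" inside "show". A possible variation, sketched here in base R under the assumption that the words contain no regex metacharacters, wraps the alternation in \\b word boundaries and then squeezes the leftover whitespace. The names pat, x and out are illustrative only.

```r
words <- c("hello", "how")

# Build "\\b(hello|how)\\b" so that only whole words are matched.
pat <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")

x <- "hello show me how"
out <- gsub(pat, "", x, perl = TRUE)    # drop the whole-word matches
out <- trimws(gsub("\\s+", " ", out))   # collapse runs of spaces, trim the ends
print(out)
```

Here "show" is left intact because there is no word boundary between "s" and "how".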

Upvotes: 5
