Reputation: 2806
Is there an existing R function that, given a text and a character vector of strings, returns the strings from the vector that are found within the text?
For example,
x <- "This is a new way of doing things."
mywords <- c("This is", "new", "not", "maybe", "things.")
filtered_words <- Rfunc(x, mywords)
Then filtered_words will contain "This is", "new" and "things.".
Is there any such function?
Upvotes: 0
Views: 791
Reputation: 869
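filterWords <- function(x, mywords) {
  # Split the text into space-separated tokens
  splitwords <- unlist(strsplit(x, split = " "))
  # Keep only the tokens that also appear in mywords
  return(splitwords[splitwords %in% mywords])
}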
This is one approach. However, it will not find entries made up of more than one word, such as "This is", because the text is split on spaces first. I thought it might still give you a little more information on what you asked.
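The multi-word limitation can be sidestepped by testing each candidate string against the whole text instead of splitting the text first. Here is a minimal base R sketch, assuming plain substring matching is what you want. The name filterPhrases is made up for illustration, and a short entry such as "new" would also match inside a longer word like "renewal".

```r
filterPhrases <- function(x, mywords) {
  # For each candidate, check whether it occurs literally in the text.
  # fixed = TRUE turns off regex interpretation, so "." means a real dot.
  found <- vapply(mywords, grepl, logical(1), x = x, fixed = TRUE)
  # Logical indexing keeps the candidates in their original order
  mywords[found]
}

x <- "This is a new way of doing things."
mywords <- c("This is", "new", "not", "maybe", "things.")
filterPhrases(x, mywords)
```

With the example from the question, this should return "This is", "new" and "things.".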
Upvotes: 0
Reputation: 887088
We can use str_extract_all from library(stringr). The output will be a list, which can be unlisted to convert it to a vector.
library(stringr)
unlist(str_extract_all(x, mywords))
#[1] "This is" "new" "things."
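One thing to keep in mind: str_extract_all treats each element of mywords as a regular expression, so the "." in "things." matches any character, not just a dot. A possible variation, assuming literal matching is what you want, wraps the patterns in fixed() and uses str_detect to pick out the entries that occur in the text:

```r
library(stringr)

x <- "This is a new way of doing things."
mywords <- c("This is", "new", "not", "maybe", "things.")

# fixed() makes each pattern a literal string rather than a regex;
# str_detect() returns one TRUE/FALSE per entry of mywords
mywords[str_detect(x, fixed(mywords))]
```

Unlike the extraction approach, this returns each matching entry once, even if it appears several times in the text.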
Upvotes: 1