Reputation: 437
I have a large data.table with lines of text in each row. I am trying to subset the data.table by finding lines that contain one of several words. Here is what I have tried.
library(data.table)
textDt <- data.table(LinesOfText = c(
  "There was a small frog.",
  "Most of the time I ate chicken",
  "There are so many places to stay here.",
  "People on stackoverflow are tremendously helpful.",
  "Why do grapefruits cause weird drug interactions?",
  "If I were tiny I could fit in there"))
targetWords <- c("small","tiny","no room","cramped","mini")
targetDt <- textDt[targetWords %in% LinesOfText]
This always results in an error. I know there must be an easy solution that eludes me.
Upvotes: 1
Views: 403
Reputation: 1709
I like using stringr because I believe it's faster. So here's a solution based on that:
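library(stringr)
# Collapse the words into a single regex alternation
targetWords <- paste(targetWords, collapse = "|")
# "small|tiny|no room|cramped|mini"
# Keep only the rows whose text matches any of the words
targetDT <- textDt[str_detect(LinesOfText, targetWords)]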
targetDT
#                            LinesOfText
#1:             There was a small frog.
#2: If I were tiny I could fit in there
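One caveat with the pattern above: joined with "|", the words are matched as regular-expression substrings, so "mini" would also match a line containing "minimum". Below is a minimal sketch of a stricter variant, assuming whole-word matches are what you want. The "\\b" word-boundary wrapping, the extra "minimum" sample line, and the names wordPattern, substringDt and wholeWordDt are my own additions for illustration, not part of the question. It uses base R's grepl, so stringr is not required.

```r
library(data.table)

# Small illustrative table; the "minimum" line is a made-up example row
textDt <- data.table(LinesOfText = c(
  "There was a small frog.",
  "Most of the time I ate chicken",
  "A minimum of two people is required.",
  "If I were tiny I could fit in there"))
targetWords <- c("small", "tiny", "no room", "cramped", "mini")

# Plain alternation: matches substrings, so "mini" also hits "minimum"
substringDt <- textDt[grepl(paste(targetWords, collapse = "|"), LinesOfText)]

# Wrap the alternation in word boundaries to require whole-word matches
wordPattern <- paste0("\\b(", paste(targetWords, collapse = "|"), ")\\b")
wholeWordDt <- textDt[grepl(wordPattern, LinesOfText)]
```

If the target words could ever contain regex metacharacters such as "." or "+", they would need escaping before being pasted into a pattern. The words in the question are plain, so that is left out here.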
Upvotes: 1