asaha

Reputation: 53

How to find words contained and not contained in a list in R

I have a vector of bigrams:

list<-c('Financial loss','Day Trading','Trading loss','Trading criteria')

I need to create a vector containing only the bigrams that include a given word and do not include certain other words.

For example, I need to extract only 'Trading loss' from the list, so I tried:

extracted_bigram <- select(list, 'Trading', -matches('Day|criteria'))

but it doesn't work.

Thanks

Upvotes: 0

Views: 179

Answers (1)

Johan Rosa

Reputation: 3152

I would do this in two steps: first drop the bigrams that contain any of the unwanted words, then keep only those that contain the word you are looking for.

library(dplyr)
library(stringr)

list <- c('Financial loss','Day Trading','Trading loss','Trading criteria')

bigram <- stringr::str_subset(list, pattern = "Day|criteria", negate = TRUE) %>% 
  stringr::str_subset(pattern = "Trading")
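For comparison, the same two-step filter can be sketched in base R with grepl(), in case stringr is not available. This is a swapped-in alternative, not the approach shown above, and the names bigrams, keep and drop are made up for the illustration.

```r
# Base R sketch of the same filter (no packages needed).
bigrams <- c("Financial loss", "Day Trading", "Trading loss", "Trading criteria")

keep <- grepl("Trading", bigrams)       # TRUE where the wanted word appears
drop <- grepl("Day|criteria", bigrams)  # TRUE where an unwanted word appears

# Keep entries that match the wanted word and none of the unwanted ones.
result <- bigrams[keep & !drop]
print(result)  # "Trading loss"
```

Combining the two logical vectors with & and ! does both steps in a single subsetting call.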

If you have a vector with the list of words, you can create the pattern with the paste function.

pattern <- paste(c("Day", "criteria"), collapse = "|")
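As a rough sketch of how a pasted pattern plugs into the two-step filter, assuming the words to exclude and include are held in two character vectors (exclude_words and include_words are hypothetical names, not from the question):

```r
library(stringr)

bigrams <- c("Financial loss", "Day Trading", "Trading loss", "Trading criteria")

# Hypothetical word vectors for this illustration.
exclude_words <- c("Day", "criteria")
include_words <- c("Trading")

# Collapse each vector into a single regex alternation, e.g. "Day|criteria".
exclude_pattern <- paste(exclude_words, collapse = "|")
include_pattern <- paste(include_words, collapse = "|")

# Step 1: drop bigrams matching any excluded word.
result <- str_subset(bigrams, exclude_pattern, negate = TRUE)
# Step 2: keep bigrams matching any included word.
result <- str_subset(result, include_pattern)

print(result)  # "Trading loss"
```

Note that these patterns match substrings and are case-sensitive, so "Day" would also match "Daylight". If that matters, wrapping the alternation in word boundaries (\\b in an R string) is one option.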

Upvotes: 2
