SSP

Reputation: 77

Count the number of words using the given set of keywords in R

How can I count, for each observation, how many of its words belong to a fixed set of keywords? Here is an example to clarify.

Here are "Text" and the set of "Keywords":

Text=c("I have bought a shirt from the store", "This shirt looks very good")
Keywords=c("have", "from", "good")

I would like to obtain the following output.

output=c(2,1)

In the first sentence in "Text" (i.e., "I have bought a shirt from the store"), two of the "Keywords" appear: "have" and "from". Likewise, in the second sentence in "Text", one keyword appears: "good".

Upvotes: 0

Views: 133

Answers (2)

Ronin scholar

Reputation: 104

You can use this call: unlist(lapply(lapply(Text,stringr::str_detect,Keywords),sum))

lapply applies a function to each element of a vector (or list), so this call:

  1. applies str_detect to each element of Text, testing it against every keyword
  2. applies sum to each logical vector produced in step 1
  3. uses unlist to turn the resulting list into the desired vector
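The three steps above can be written out one at a time to make the intermediate results visible. This is only a sketch using the data from the question; the names detected, counts_list, counts and counts_compact are illustrative and not part of the original answer.

```r
library(stringr)

Text <- c("I have bought a shirt from the store", "This shirt looks very good")
Keywords <- c("have", "from", "good")

# Step 1: for each sentence, test every keyword.
# Each list element is a logical vector with one entry per keyword.
detected <- lapply(Text, str_detect, Keywords)

# Step 2: count the TRUE values in each logical vector.
counts_list <- lapply(detected, sum)

# Step 3: flatten the list into a plain vector.
counts <- unlist(counts_list)
counts

# A more compact variant (an alternative, not the answer's own code):
# vapply does all three steps and checks that each result is one integer.
counts_compact <- vapply(
  Text,
  function(s) sum(str_detect(s, Keywords)),
  integer(1),
  USE.NAMES = FALSE
)
counts_compact
```

Note that str_detect matches substrings, so a keyword such as "have" would also be detected inside "behave", and each keyword is counted at most once per sentence even if it occurs several times.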

Upvotes: 1

Ronak Shah

Reputation: 389215

You can add word boundaries (\\b) to Keywords and collapse them into a single regular expression for str_count, so that only whole-word matches are counted.

library(stringr)
str_count(Text, str_c('\\b',Keywords, '\\b', collapse = '|'))
#[1] 2 1

In base R, you could use regmatches + gregexpr.

lengths(regmatches(Text, gregexpr(paste0('\\b',Keywords, '\\b', collapse = '|'), Text)))
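To see what the word boundaries change, here is a small base R sketch that compares the bounded pattern with a plain one. The sentence Extra and the helper count_matches are hypothetical and were made up for this illustration; they do not come from the question.

```r
Text <- c("I have bought a shirt from the store", "This shirt looks very good")
Keywords <- c("have", "from", "good")

# Hypothetical sentence where keywords occur only inside longer words.
Extra <- "They behave well and sell goods"

# Pattern with word boundaries, as in the answer above.
pattern_bounded <- paste0("\\b", Keywords, "\\b", collapse = "|")
# Same keywords without word boundaries, for comparison.
pattern_plain <- paste(Keywords, collapse = "|")

# Illustrative helper: number of regex matches in each element of x.
count_matches <- function(pattern, x) {
  lengths(regmatches(x, gregexpr(pattern, x)))
}

bounded_text  <- count_matches(pattern_bounded, Text)
bounded_extra <- count_matches(pattern_bounded, Extra)
plain_extra   <- count_matches(pattern_plain, Extra)
```

With the bounded pattern, "behave" and "goods" should not be counted, while the plain pattern picks up the "have" and "good" inside them. Unlike a per-keyword detection, this approach counts every occurrence, so a keyword repeated in a sentence is counted each time.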

Upvotes: 1
