user5705407

Reputation: 65

match and count frequencies of words exactly from a string in R

So I have a block of text like this:

"the worst most unprofessional ... I wouldn't recommend...I commend her for hardwork......"

How can I match the exact word "commend" and count how often it occurs?

Problem:

I'm trying to count how many times the word "commend" appears.

    wrds <- gregexpr(pattern = "^commend$", string, fixed = TRUE)[[1]]
    length(wrds)

but it returns -1

And if I try:

    gregexpr(pattern = "commend", string, fixed = TRUE)[[1]]

the output has 2 matches, counting both "commend" and the "commend" inside "recommend".

What am I missing with gregexpr?

Upvotes: 2

Views: 860

Answers (1)

Andrew Schwartz

Reputation: 4657

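  1. Don't use fixed = TRUE. We want a regexp, not a literal string; with fixed = TRUE, characters such as ^ and $ are matched literally.
  2. Use the word boundary \b rather than ^ and $, which anchor to the start and end of the whole string, not of a word. To write \b in an R string you need to escape the backslash: "\\b"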

    wrds <- gregexpr(pattern = "\\bcommend\\b", string)[[1]]

Definitely don't pad the pattern with spaces instead. That would fail to match "commend," and many other cases, such as the word sitting next to punctuation or at the very start or end of the string. That's what the word boundary is for.
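
To turn the match positions into a count, here is a minimal sketch using the sample text from the question. It assumes the text is stored in a variable called `string`, as in the question; `count_commend` and `count_substring` are names made up for illustration. Note that gregexpr() returns -1 when nothing matches, so length() would report 1 in that case; counting the positive positions avoids that.

```r
# Sample text from the question, stored in `string` as the question assumes
string <- "the worst most unprofessional ... I wouldn't recommend...I commend her for hardwork......"

# Regex match with word boundaries; fixed = TRUE is deliberately left out
wrds <- gregexpr(pattern = "\\bcommend\\b", string)[[1]]

# gregexpr() yields -1 for "no match", so count only the positive positions
count_commend <- sum(wrds > 0)
print(count_commend)    # 1: only the standalone word

# For comparison: the literal substring search also hits "recommend"
sub_hits <- gregexpr(pattern = "commend", string, fixed = TRUE)[[1]]
count_substring <- sum(sub_hits > 0)
print(count_substring)  # 2
```

If the match should ignore case (for example "Commend" at the start of a sentence), passing ignore.case = TRUE to gregexpr() is one option.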

Upvotes: 3
