Daniel

Reputation: 435

Count number of times a word appears (dplyr)

Simple question here, perhaps a duplicate of this?

I'm trying to figure out how to count the number of times a word appears in a vector. I know I can count the number of rows a word appears in, as shown here:

library(dplyr)  # provides tibble(), count() and %>%

temp <- tibble(idvar = 1:3,
               response = c("This sounds great",
                            "This is a great idea that sounds great",
                            "What a great idea"))
temp %>% count(grepl("great", response)) # lots of ways to do this line
# answer = 3

The answer in the code above is 3 since "great" appears in three rows. However, the word "great" appears four times in total in the vector "response". How do I count that instead?

Upvotes: 2

Views: 3288

Answers (2)

Vlad C.

Reputation: 974

Off the top of my head, this should solve your problem:

library(tidyverse)
temp$response %>% 
  str_extract_all('great') %>%
  unlist %>%
  length
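
If the per-row breakdown is also useful, here is a minimal sketch along the same lines. It assumes stringr is installed, and `response` simply recreates the vector from the question. The idea is to take `lengths()` of the list that `str_extract_all()` returns before collapsing it:

```r
library(stringr)

# Recreate the vector from the question
response <- c("This sounds great",
              "This is a great idea that sounds great",
              "What a great idea")

# str_extract_all() returns one character vector of matches per element
matches <- str_extract_all(response, "great")

per_row <- lengths(matches)  # number of matches in each element
total   <- sum(per_row)      # number of matches across the whole vector

per_row
total
```

Here `per_row` should come out as 1, 2, 1 and `total` as 4, which is the count asked for in the question.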

Upvotes: 2

akrun

Reputation: 887078

We could use str_count from stringr to count the occurrences of 'great' in each row and then sum those counts:

library(tidyverse)
temp %>% 
   mutate(n = str_count(response, 'great')) %>%
   summarise(n = sum(n))
# A tibble: 1 x 1
#      n
#   <int>
#1     4

Or use regmatches/gregexpr from base R:

sum(lengths(regmatches(temp$response, gregexpr('great', temp$response))))
#[1] 4
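
One caveat: the pattern 'great' is matched as a substring, so it would also count "greater" and would miss "Great". If only whole words should count, a possible sketch in base R is below. The helper name `count_word`, its `ignore_case` argument, and the fourth row of the data are all made up for illustration, and it assumes `word` contains no regex metacharacters:

```r
# Question's vector plus one hypothetical extra row to show the difference
response <- c("This sounds great",
              "This is a great idea that sounds great",
              "What a great idea",
              "Great, but greater care is needed")

# Hypothetical helper: count whole-word occurrences of `word` in `x`.
# \\b is a word boundary; perl = TRUE selects PCRE regular expressions.
count_word <- function(x, word, ignore_case = FALSE) {
  pattern <- paste0("\\b", word, "\\b")
  hits <- gregexpr(pattern, x, ignore.case = ignore_case, perl = TRUE)
  sum(lengths(regmatches(x, hits)))
}

# Plain substring count: also picks up the "great" inside "greater"
substring_total <- sum(lengths(regmatches(response, gregexpr("great", response))))

whole_word_total  <- count_word(response, "great")
ignore_case_total <- count_word(response, "great", ignore_case = TRUE)
```

With this data the substring count is 5 (it includes "greater"), the whole-word count is 4, and the case-insensitive whole-word count is 5 (it adds "Great"). Which one is wanted depends on what should count as "the word".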

Upvotes: 3
