Reputation: 435
Simple question here, perhaps a duplicate of this?
I'm trying to figure out how to count the number of times a word appears in a vector. I know I can count the number of rows a word appears in, as shown here:
library(tidyverse)

temp <- tibble(idvar = 1:3,
               response = c("This sounds great",
                            "This is a great idea that sounds great",
                            "What a great idea"))
temp %>% count(grepl("great", response)) # lots of ways to do this line
# answer = 3
The answer in the code above is 3 since "great" appears in three rows. However, the word "great" appears 4 times in total in the vector "response". How do I count that instead?
Upvotes: 2
Views: 3288
Reputation: 974
Off the top of my head, this should solve your problem:
library(tidyverse)
temp$response %>%
  str_extract_all('great') %>%
  unlist %>%
  length
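For anyone wondering why the unlist step is there: str_extract_all returns a list with one character vector per input string, so the matches have to be flattened before they are counted. A minimal sketch, assuming the same three sentences stored in a plain vector (the name `responses` is made up for illustration):

```r
library(stringr)

# Same sentences as in the question, as a plain character vector
responses <- c("This sounds great",
               "This is a great idea that sounds great",
               "What a great idea")

# One list element per input string, holding every match found in it
matches <- str_extract_all(responses, "great")

lengths(matches)         # matches per string: 1 2 1
length(unlist(matches))  # total matches across the vector: 4
```

The per-string counts are handy if you also want to know which responses repeat the word.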
Upvotes: 2
Reputation: 887078
We could use str_count from stringr to count the occurrences of 'great' in each row, and then take the sum of those counts:
library(tidyverse)
temp %>%
  mutate(n = str_count(response, 'great')) %>%
  summarise(n = sum(n))
# A tibble: 1 x 1
#       n
#   <int>
# 1     4
Or using regmatches/gregexpr from base R:
sum(lengths(regmatches(temp$response, gregexpr('great', temp$response))))
#[1] 4
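One caveat that may or may not matter for your data: the pattern 'great' is matched as a substring, so it would also count words like "greater", and it is case-sensitive. A rough sketch of how word boundaries and case-insensitive matching could be added (the vector `x` is invented for illustration and is not from the question):

```r
library(stringr)

# Made-up example containing "Great", "greater" and "great"
x <- c("A great idea", "Great, even greater than great")

# Substring match: counts "greater" too, skips the capitalised "Great"
sum(str_count(x, "great"))
# [1] 3

# Whole word only, still case-sensitive
sum(str_count(x, "\\bgreat\\b"))
# [1] 2

# Whole word, ignoring case
sum(str_count(x, regex("\\bgreat\\b", ignore_case = TRUE)))
# [1] 3
```

For the sample data in the question all three variants give 4, so this only matters if the real responses contain longer words or mixed case.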
Upvotes: 3