Teebs

Reputation: 11

Using str_detect to locate the frequency of numerous words in another data frame

I have a data frame with one column (7,234 rows) of YouTube video titles. I have a separate list of 71 key words.

I would like to find the frequency of each key word across all 7,234 rows.

Using str_detect I'm able to find the frequency of each separate key word.

This gives me a logical result when I use summary:

Mode   FALSE    TRUE 
logical    1462    5772

I am not sure how to use a for loop to do this for all key words, though, or how to put the results into a new data frame with the column names: Video Title, Freq True, Freq False

Thanks

Upvotes: 0

Views: 37

Answers (1)

Chris Ruehlemann

Reputation: 21432

You don't need a for loop. Just split the sentences into single words, count each word, and keep only the key words with their frequencies:

Toy data:

words <- c("apple", "pear", "grape")
sentences <- c("I have an apple and a pear", 
               "Grape is my favorite but I also like apple", 
               "I don't like pear and I don't like apple or applepie",
               "She hates fruit")

library(dplyr)
library(tidyr)
data.frame(sentences) %>%
  # separate sentences into single words:
  separate_rows(sentences, sep = "\\s") %>%
  # convert to lower-case:
  mutate(sentences = tolower(sentences)) %>%
  group_by(sentences) %>%
  # count:
  summarise(freq = n()) %>%
  filter(sentences %in% words)
# A tibble: 3 x 2
  sentences  freq
* <chr>     <int>
1 apple         3
2 grape         1
3 pear          2
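
Note that the pipeline above counts word occurrences, so a key word that appears twice in one title is counted twice. If what you want is the number of titles that do or do not contain each key word (the TRUE/FALSE counts that summary gives for a single str_detect call), something like the following might work. This is only a sketch on the same toy data: the names hits and keyword_freq and the column names keyword, freq_true and freq_false are made up for illustration, and wrapping each key word in \\b assumes the key words are plain words with no regex metacharacters.

```r
library(stringr)

# Toy data, same as above
words <- c("apple", "pear", "grape")
sentences <- c("I have an apple and a pear",
               "Grape is my favorite but I also like apple",
               "I don't like pear and I don't like apple or applepie",
               "She hates fruit")

# One logical column per key word: TRUE where the sentence contains the
# key word as a whole word (\\b), ignoring case.
# Assumption: key words contain no regex metacharacters.
hits <- vapply(words, function(w) {
  str_detect(sentences, regex(paste0("\\b", w, "\\b"), ignore_case = TRUE))
}, logical(length(sentences)))

# Number of sentences with and without each key word
keyword_freq <- data.frame(
  keyword    = words,
  freq_true  = unname(colSums(hits)),
  freq_false = unname(colSums(!hits))
)
keyword_freq
```

On the toy data this should give 3 TRUE / 1 FALSE for apple, 2 / 2 for pear and 1 / 3 for grape. For the real data you would pass the title column and the vector of 71 key words in place of sentences and words.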

Upvotes: 0
