R - Counting total occurrences of words from a list that appear in a data frame, grouped by ID

Question

I have a data frame like this:

ID   Word
1    Tree
1    House
1    Tree
2    Snail
2    Tree
3    Car

And I have a list of keywords I want to check for:

(House, Tree, Bird)

For each ID, I want to know how many times any word from my list of keywords appears.

That is, House, Tree or Bird appears 3 times for ID 1, only once for ID 2, and not at all for ID 3:

ID   Count
1     3
2     1
3     0

I am not sure how to tackle this. I know how to count the number of times a word appears within each ID, but not how many times the words from another list appear.

Thank you for any suggestions/guidance etc.

akrun · Accepted Answer

We can create a logical index with %in% and take its sum grouped by 'ID'. I'm not sure whether 'v1' is a vector or a list; if it is a list, call unlist(v1) first and use the result with the same code.

library(dplyr)
df1 %>% 
   group_by(ID) %>% 
   summarise(Count = sum(Word %in% v1))
# A tibble: 3 x 2
#     ID Count
#  <int> <int>
#1     1     3
#2     2     1
#3     3     0
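
To make the "logical index" step visible, here is a minimal base R sketch of the same idea, assuming the 'df1' and 'v1' objects from the data section of this answer. The names is_keyword, counts and res are only illustrative, and tapply is swapped in for the dplyr grouping.

```r
# Data as defined in this answer's 'data' section
v1 <- c("House", "Tree", "Bird")
df1 <- data.frame(ID = c(1L, 1L, 1L, 2L, 2L, 3L),
                  Word = c("Tree", "House", "Tree", "Snail", "Tree", "Car"))

# Logical index: TRUE where 'Word' is one of the keywords
is_keyword <- df1$Word %in% v1
# TRUE TRUE TRUE FALSE TRUE FALSE

# Sum the TRUE values within each 'ID' (TRUE counts as 1, FALSE as 0)
counts <- tapply(is_keyword, df1$ID, sum)

# Reshape into the two-column layout asked for in the question
res <- data.frame(ID = as.integer(names(counts)), Count = as.vector(counts))
res
```

Because the sum runs over every row of each group, an 'ID' with no matching words still shows up with a count of 0.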

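Or filter and then do a count. Because filter removes every row of an 'ID' that has no matches, 'ID' has to be a factor for .drop = FALSE to keep that level with a count of 0

df1 %>% 
   mutate(ID = factor(ID)) %>%
   filter(Word %in% v1) %>%
   count(ID, .drop = FALSE)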

data

v1 <- c("House", "Tree", "Bird")
df1 <- structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 3L), 
    Word = c("Tree", "House", "Tree", "Snail", "Tree", "Car")), 
    class = "data.frame", row.names = c(NA, -6L))
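
If the keywords are stored as a list rather than a character vector, the unlist step mentioned above could look like the sketch below. The list v1_list and the vectors words and hits are hypothetical, made up here only to illustrate the conversion.

```r
# Hypothetical case: the keywords arrive as a list, not a character vector
v1_list <- list("House", "Tree", "Bird")

# Flatten the list into a plain character vector
v1 <- unlist(v1_list)

# The same %in% test then works unchanged
words <- c("Tree", "House", "Tree", "Snail", "Tree", "Car")
hits <- words %in% v1
hits
# TRUE TRUE TRUE FALSE TRUE FALSE
```

After this, 'v1' can be used in either of the dplyr pipelines exactly as written.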
