How to find all combinations in column and count occurrences in data

Question

I am trying to find all combinations of the values in column 1 that actually occur in my data.

I then want to count all occurrences of these by column 2.

It feels like R should be able to do this fairly quickly. I tried reading up on combn and expand.grid, but with no success. The main problem was I could not find any guidance on how to generate combinations within a column.

My data looks like:

Animal (n=57) | Person ID (n=1000)
Dog     | 0001
Cat     | 0004
Bird    | 0001
Snake   | 0002 
Spider  | 0002
Cat     | 0003
Dog     | 0004

Expected output is:

AnimalComb  | CountbyID
Cat         | 1
DogBird     | 1
SnakeSpider | 1
CatDog      | 1

EDIT: deleted an erroneous entry for Cat.

Ronak Shah · Accepted Answer

If I have understood you correctly, you need to group_by PersonID, paste together all the unique Animals in each group, and count the occurrences of that combination. The count can be obtained by dividing the number of rows in the group (n()) by the number of distinct values (n_distinct).

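library(dplyr)

# Assumes the data frame is named df, with columns Animal and PersonID
df %>%
  group_by(PersonID) %>%
  # Concatenate each person's distinct animals into one string
  summarise(AnimalComb = paste(unique(Animal), collapse = ""),
            # Rows in the group divided by the number of distinct animals
            CountbyID = n() / n_distinct(Animal))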

#  PersonID AnimalComb  CountbyID
#1        1 DogBird             1
#2        2 SnakeSpider         1
#3        3 Cat                 1
#4        4 CatDog              1
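
If the goal is instead to count how many Person IDs share each combination, one possible approach is to build the per-person string first and then tally it. The following is only a minimal sketch: the data frame is reconstructed from the sample in the question, and the names df, Animal, PersonID and combos are assumptions, not something the question confirms.

```r
library(dplyr)

# Sample data reconstructed from the question; the object and column
# names (df, Animal, PersonID) are assumptions.
df <- data.frame(
  Animal   = c("Dog", "Cat", "Bird", "Snake", "Spider", "Cat", "Dog"),
  PersonID = c("0001", "0004", "0001", "0002", "0002", "0003", "0004"),
  stringsAsFactors = FALSE
)

combos <- df %>%
  group_by(PersonID) %>%
  # One row per person: their distinct animals, in order of appearance
  summarise(AnimalComb = paste(unique(Animal), collapse = ""),
            .groups = "drop") %>%
  # Number of persons that share each combination
  count(AnimalComb, name = "CountbyID")

combos
```

Note that the string depends on the order in which animals appear for each person, so "CatDog" and "DogCat" would be counted separately. Wrapping the animals in sort(), as in paste(sort(unique(Animal)), collapse = ""), would treat them as the same combination.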
