Counting strings in R

Question

I have a data set as shown below. I would like to group by SO and then count how many times each string occurs in AU_UN within each group. Many thanks in advance.

SO = c("Journal Of Business", "Journal Of Business", "Journal of Economy")

AU_UN = c("Dartmouth Coll;Wellesley Coll;Wellesley Coll",
          "Georgetown Univ;Fed Reserve Syst",
          "Georgetown Univ;Fed Reserve Syst")

df <- data.frame(SO, AU_UN)
df

Expected Answer

Journal Of Business      Dartmouth Coll (1);Wellesley Coll (2);Georgetown Univ (1);Fed Reserve Syst (1)
Journal of Economy       Georgetown Univ (1);Fed Reserve Syst (1)

G. Grothendieck · Accepted Answer

Use separate_rows to convert to long form (one row per institution), count the rows for each SO and AU_UN pair with count, and then collapse back to one row per SO with summarize.

library(dplyr)
library(tidyr)

df %>% 
  separate_rows(AU_UN, sep = ";") %>% 
  count(SO, AU_UN) %>% 
  group_by(SO) %>% 
  summarize(AU_UN = paste(sprintf("%s (%d)", AU_UN, n), collapse=";"), .groups = "drop")

giving:

# A tibble: 2 x 2
  SO                  AU_UN
1 Journal Of Business Dartmouth Coll (1);Fed Reserve Syst (1);Georgetown Univ (1);Wellesley Coll (2)
2 Journal of Economy  Fed Reserve Syst (1);Georgetown Univ (1)
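
Note that count sorts its grouping columns, so the institutions above come out alphabetically rather than in the order shown in the question's expected answer. If first-appearance order matters, one possible variation (a sketch, not part of the original answer) is to turn AU_UN into a factor whose levels follow the order in which the names first occur before counting. The name result is only an illustrative variable.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
SO <- c("Journal Of Business", "Journal Of Business", "Journal of Economy")
AU_UN <- c("Dartmouth Coll;Wellesley Coll;Wellesley Coll",
           "Georgetown Univ;Fed Reserve Syst",
           "Georgetown Univ;Fed Reserve Syst")
df <- data.frame(SO, AU_UN)

result <- df %>%
  separate_rows(AU_UN, sep = ";") %>%
  # Assumption: order factor levels by first appearance so that count()
  # sorts institutions in that order instead of alphabetically
  mutate(AU_UN = factor(AU_UN, levels = unique(AU_UN))) %>%
  count(SO, AU_UN) %>%
  group_by(SO) %>%
  # as.character() makes the factor-to-string conversion explicit
  summarize(AU_UN = paste(sprintf("%s (%d)", as.character(AU_UN), n), collapse = ";"),
            .groups = "drop")

result
```

With the sample data this should list Dartmouth Coll first and Fed Reserve Syst last for Journal Of Business, which matches the order in the expected answer.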
