Counting "Yes" in multiple columns in a data-frame using dplyr

Question

Suppose I am having the following data. [As requested I am adding the data]

col1 <- c("Team A", "Team A", "Team A", "Team B", "Team B", "Team B", "Team C", "Team C", "Team C", "Team D", "Team D", "Team D")
col2 <- c("High",   "Medium", "Medium", "Low", "Low", "Low", "High", "Medium", "Low", "Medium", "Medium", "Medium")
col3 <- c("Yes", "Yes", "No", "No", "No", "Yes", "No", "Yes", "No", "Yes", "Yes", "Yes")
col4 <- c("No", "Yes", "No", "Yes", "Yes", "No", "No", "Yes", "No", "Yes", "No", "Yes")
df <- data.frame(col1, col2, col3, col4)

# Col1      Col2    Col3    Col4
# Team A    High    Yes     No
# Team A    Medium  Yes     Yes
# Team A    Medium  No      No
# Team B    Low     No      Yes
# Team B    Low     No      Yes
# Team B    Low     Yes     No
# Team C    High    No      No
# Team C    Medium  Yes    Yes
# Team C    Low     No     No 
# Team D    Medium  Yes    Yes
# Team D    Medium  Yes    No
# Team D    Medium  Yes    Yes

I want to use dplyr function to get the following result. Status_1 would need to be the count of the number of "Yes" in Col3 against each team while Status_2 would be the count of "Yes" in Col4 against each team

       High Medium  Low Status_1    Status_2
Team A    1      2    0        2           1
Team B    0      0    3        1           2
Team C    1      1    1        1           1
Team D    0      3    0        3           2

I am able to generate the normal summary but for the last two columns of "Status_1" and "Status_2" using the following statement. Can anyone help, please?

df %>%
  group_by(Col1, Col2) %>%
  summarise(Count = n()) %>%
  spread(Col1, Count, fill = 0)

Darren Tsai · Accepted Answer

First, group the data by col1 to count the number of Yes in col3 and col4. Then group again by all columns and count the number of observations in each group using n(). Finally, use tidyr::pivot_wider to transform data from long to wide.

df %>%
  group_by(col1) %>%
  mutate_at(vars(col3:col4), ~ sum(. == "Yes")) %>%
  rename(status_1 = col3, status_2 = col4) %>% 
  group_by_all %>%
  summarise(n = n()) %>%
  tidyr::pivot_wider(names_from = col2, values_from = n, values_fill = list(n = 0))

# # A tibble: 4 x 6
#   col1   status_1 status_2  High Medium   Low
#                
# 1 Team A        2        1     1      2     0
# 2 Team B        1        2     0      0     3
# 3 Team C        1        1     1      1     1
# 4 Team D        3        2     0      3     0

Counting "Yes" in multiple columns in a data-frame using dplyr

Answers (2)

Related Questions

Counting &quot;Yes&quot; in multiple columns in a data-frame using dplyr

Answers (2)

Related Questions

Counting "Yes" in multiple columns in a data-frame using dplyr