Eisen

Reputation: 1897

Group by count by a subgroup in R

I have this example code:

df |>
  dplyr::group_by(label) |>
  dplyr::summarize(avg_col = mean(count_col, na.rm = TRUE),
                   med_col = median(count_col, na.rm = TRUE),
                   n = n()) |>
  dplyr::arrange(desc(avg_col))

I want to get the percentage of rows in each group where count_col is 1. How can I do this inside the summarize statement? I think I first need to filter to count_col == 1 and then divide that count by the total count.

I'm not sure how to do this, though.

Upvotes: 1

Views: 33

Answers (1)

akrun

Reputation: 887851

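We can take the mean of a logical vector. count_col == 1 is TRUE or FALSE for each row, and mean() counts these as 1 and 0, so the result is the proportion of rows equal to 1. Multiplying by 100 gives the percentage, and no separate filter step is needed: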

df |>
  dplyr::group_by(label) |>
  dplyr::summarize(perc_count_one = 100 * mean(count_col == 1, na.rm = TRUE),
                   avg_col = mean(count_col, na.rm = TRUE),
                   med_col = median(count_col, na.rm = TRUE),
                   n = n()) |>
                   n = n()) |>
  dplyr::arrange(desc(avg_col))
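
To see why this works, and how missing values affect the denominator, here is a minimal base R sketch on made-up data (the values below are hypothetical, not from the question). With na.rm = TRUE, rows where count_col is NA are dropped from both the numerator and the denominator. If the percentage should instead be taken over all rows in the group, which is what "divide by the total count" would give, the dplyr equivalent inside summarize would be 100 * sum(count_col == 1, na.rm = TRUE) / n().

```r
# Toy data: hypothetical values, used only for illustration
df <- data.frame(
  label     = c("a", "a", "a", "a", "b", "b", "b"),
  count_col = c(1, 1, 2, NA, 1, 3, 3)
)

# The comparison gives TRUE/FALSE (NA where count_col is NA);
# mean() and sum() treat TRUE as 1 and FALSE as 0
is_one <- df$count_col == 1

# Percentage among non-missing values, per label
# (same idea as 100 * mean(count_col == 1, na.rm = TRUE))
perc_non_missing <- 100 * tapply(is_one, df$label, mean, na.rm = TRUE)

# Percentage among all rows, per label: NA rows stay in the denominator
# (same idea as 100 * sum(count_col == 1, na.rm = TRUE) / n())
perc_all_rows <- 100 * tapply(
  is_one, df$label,
  function(x) sum(x, na.rm = TRUE) / length(x)
)

print(perc_non_missing)
print(perc_all_rows)
```

The two results differ only for groups that contain NA values (label "a" here), so which one to use depends on whether missing counts should count against the percentage.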

Upvotes: 0
