Reputation: 1897
I have the following example code:
df |>
dplyr::group_by(label) |>
dplyr::summarize(avg_col = mean(count_col, na.rm = TRUE),
med_col = median(count_col, na.rm = TRUE),
n = n()) |>
dplyr::arrange(desc(avg_col))
I want to get the percentage of rows where count_col is 1. How can I do this inside the summarize statement? I think I first need to filter to count_col == 1 and then divide that count by the total count.
I'm not sure how to do this, though.
Upvotes: 1
Views: 33
Reputation: 887851
We can take the mean of a logical vector to get the proportion of TRUE values, then multiply by 100 to get the percentage:
df |>
dplyr::group_by(label) |>
dplyr::summarize(perc_count_one = 100 * mean(count_col == 1, na.rm = TRUE),
avg_col = mean(count_col, na.rm = TRUE),
med_col = median(count_col, na.rm = TRUE),
n = n()) |>
dplyr::arrange(desc(avg_col))
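One thing that may matter here: with na.rm = TRUE, rows where count_col is NA are dropped from the denominator as well, so the result is the percentage among non-missing values. If the goal is to divide by the total row count (as the question describes), summing the logical vector and dividing by n() is one option. Below is a minimal sketch on made-up data; the data frame values and the output column names perc_of_non_missing and perc_of_all_rows are assumptions for illustration only, and it assumes R 4.1+ (for the native pipe) with dplyr attached.

```r
library(dplyr)

# Hypothetical toy data; `label` and `count_col` mirror the question's columns
df <- data.frame(
  label     = c("a", "a", "a", "a", "b", "b", "b"),
  count_col = c(1, 1, 2, NA, 1, 3, 3)
)

res <- df |>
  group_by(label) |>
  summarize(
    # NA rows are excluded from both numerator and denominator
    perc_of_non_missing = 100 * mean(count_col == 1, na.rm = TRUE),
    # NA rows are excluded from the numerator but still counted in n()
    perc_of_all_rows    = 100 * sum(count_col == 1, na.rm = TRUE) / n(),
    n                   = n()
  )

print(res)
```

The two percentages agree for any group without missing values (group "b" here) and differ only where count_col contains NA (group "a").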
Upvotes: 0