Calculating proportion of values in a column based on different column in R

Question

Say I have a dataframe with two columns, such as

data.frame(experiment = rep(c('e1', 'e2'),each = 3), 
           outcomes = c('NH', 'NH', 'NH', 'H', 'NH', 'H'))

For each value in one column, I want to calculate the proportion of rows that have a particular value in a different column. In my example, I want the proportion of outcomes in e1 and in e2 that are 'NH'. The expected result is:

experiment	Proportion
e1	1
e2	0.333

akrun · Accepted Answer

We could group by experiment and take the mean of the logical vector outcomes == 'NH':

library(dplyr)
df1 %>%
   group_by(experiment) %>%
   summarise(Proportion = mean(outcomes == 'NH'))
# A tibble: 2 x 2
  experiment Proportion
  <chr>           <dbl>
1 e1              1    
2 e2              0.333
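
This works because outcomes == 'NH' is a logical vector and mean() counts TRUE as 1 and FALSE as 0, so the mean is the share of TRUE values. A minimal base R sketch of the same idea, assuming the example data from the question (rebuilt here so the snippet runs on its own); is_nh and prop_nh are illustrative names, not part of the original answer:

```r
# Example data from the question, rebuilt so the snippet is self-contained
df1 <- data.frame(experiment = rep(c("e1", "e2"), each = 3),
                  outcomes = c("NH", "NH", "NH", "H", "NH", "H"))

# TRUE where the outcome is 'NH', FALSE otherwise
is_nh <- df1$outcomes == "NH"

# The mean of a logical vector is the proportion of TRUE values;
# tapply() computes it separately for each experiment
prop_nh <- tapply(is_nh, df1$experiment, mean)
prop_nh
```

The result is a named vector (one entry per experiment) rather than a data frame, so this is mainly useful for seeing what the dplyr version computes.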

Or use table/proportions in base R

 proportions(table(df1), 1)[, 'NH', drop = FALSE]
          outcomes
experiment        NH
        e1 1.0000000
        e2 0.3333333
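
Note that proportions() is only available from R 4.0.1 onward. On older versions, prop.table() should give the same result. A sketch under that assumption, again rebuilding the example data so it runs on its own; tab and row_props are illustrative names:

```r
# Example data from the question, rebuilt so the snippet is self-contained
df1 <- data.frame(experiment = rep(c("e1", "e2"), each = 3),
                  outcomes = c("NH", "NH", "NH", "H", "NH", "H"))

# Counts with experiments in rows and outcomes in columns
tab <- table(df1)

# margin = 1 divides each cell by its row total, giving per-experiment proportions
row_props <- prop.table(tab, margin = 1)

# Keep only the 'NH' column; drop = FALSE preserves the table layout
row_props[, "NH", drop = FALSE]
```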

data

df1 <- structure(list(experiment = c("e1", "e1", "e1", "e2", "e2", "e2"
), outcomes = c("NH", "NH", "NH", "H", "NH", "H")), class = "data.frame", 
row.names = c(NA, 
-6L))
