maycca

Reputation: 4090

dplyr: Calculate percent change between summarized groups

I am trying to calculate the percentage change between groups, with one control and several treatments, organized as groups in my data.frame. As I have many observations, I am using dplyr. What I do not understand is how to specify which group the others should be compared to. Normally, I would split this task into multiple steps.

I wonder, however, whether dplyr already has a simpler, more straightforward way to do this?

Dummy example

library(dplyr)

set.seed(5)
dd <- data.frame(id = rep(c(1:4), 3),
                 val = c(rnorm(4) +2,
                         rnorm(4) +3,
                         rnorm(4) +4),
                 grp = rep(c("control", "ch1", "ch2"), each = 4))

dd %>% 
  group_by(grp) %>% 
  summarise(my_mean = mean(val)) 

Expected outcome with calculated % change between 'control' and individual treatments:

# A tibble: 3 x 3
  grp     my_mean   perc_change
  <fct>     <dbl>   <dbl>
1 ch1        2.30    XX
2 ch2        5.00    YY
3 control    1.39    0

Upvotes: 1

Views: 1434

Answers (2)

AnilGoyal

Reputation: 26218

Do you want this?

library(tidyverse)
set.seed(5)
dd <- data.frame(id = rep(c(1:4), 3),
                 val = c(rnorm(4) +2,
                         rnorm(4) +3,
                         rnorm(4) +4),
                 grp = rep(c("control", "ch1", "ch2"), each = 4))

dd %>% 
  group_by(grp) %>% 
  summarise(my_mean = mean(val)) %>%
  mutate(perc_change = scales::percent((my_mean - my_mean[grp == 'control'])/my_mean[grp == 'control']))
#> # A tibble: 3 x 3
#>   grp     my_mean perc_change
#>   <chr>     <dbl> <chr>      
#> 1 ch1        3.00 63%        
#> 2 ch2        4.07 121%       
#> 3 control    1.84 0%

Created on 2021-07-31 by the reprex package (v2.0.0)
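
Note that the output above shows perc_change as <chr>, because scales::percent() returns formatted text. If the value is needed for further calculations, one option is to keep a numeric column and add a separate label column. This is only a sketch: the column name perc_label and the accuracy setting are my own assumptions, not part of the question.

```r
library(dplyr)

set.seed(5)
dd <- data.frame(id = rep(1:4, 3),
                 val = c(rnorm(4) + 2, rnorm(4) + 3, rnorm(4) + 4),
                 grp = rep(c("control", "ch1", "ch2"), each = 4))

res <- dd %>%
  group_by(grp) %>%
  summarise(my_mean = mean(val)) %>%
  mutate(
    # numeric ratio relative to the control mean, usable in later calculations
    perc_change = (my_mean - my_mean[grp == "control"]) / my_mean[grp == "control"],
    # assumption: a separate display-only column (perc_label is a made-up name)
    perc_label = scales::percent(perc_change, accuracy = 0.1)
  )

res
```

The numeric column can then be filtered or plotted, while the label column is only for printing.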

Upvotes: 3

Ronak Shah

Reputation: 388982

Are you looking for this?

library(dplyr)

dd %>% 
  group_by(grp) %>% 
  summarise(my_mean = mean(val))  %>%
  mutate(perc_change = (my_mean - my_mean[match('control', grp)])/ my_mean[match('control', grp)] * 100)
# Alternatively, index with `==` instead of match():
# mutate(perc_change = (my_mean - my_mean[grp == 'control']) / my_mean[grp == 'control'] * 100)
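
If the reference group may change, or may be missing from some data sets, the same calculation could be wrapped in a small function. The sketch below is one possible way to do it: perc_change_vs() and its ref argument are hypothetical names I made up, and it assumes the columns are called grp and val as in the question. With match(), a missing reference group gives NA for every row, so the helper stops early instead.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): percent change of each group mean
# relative to the mean of a named reference group.
perc_change_vs <- function(data, ref = "control") {
  out <- data %>%
    group_by(grp) %>%
    summarise(my_mean = mean(val))

  ref_mean <- out$my_mean[out$grp == ref]
  # Fail early if the reference group is absent (or somehow matched twice)
  stopifnot(length(ref_mean) == 1)

  out %>%
    mutate(perc_change = (my_mean - ref_mean) / ref_mean * 100)
}

# Same dummy data as in the question
set.seed(5)
dd <- data.frame(id = rep(1:4, 3),
                 val = c(rnorm(4) + 2, rnorm(4) + 3, rnorm(4) + 4),
                 grp = rep(c("control", "ch1", "ch2"), each = 4))

res <- perc_change_vs(dd)                  # compare against "control"
res_ch1 <- perc_change_vs(dd, ref = "ch1") # compare against "ch1" instead
```

The row for the reference group itself comes out as 0, and the other rows are expressed relative to it.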

Upvotes: 2
