user3919790

Reputation: 557

map across nested data frame with mutate_at

I am trying to apply map_dbl across a data frame in which many variables are nested (list columns). Each element of a nested variable contains a vector of 10,000 numbers, and the data frame has multiple variables like this.

For each element of a nested variable I want to extract the 2.5th, 50th and 97.5th centiles. I have tried this with map_dbl and it works for each element of a single nested variable. However, I am trying to make it more efficient and was wondering if anyone could help.

I have given a small reproducible example below

# creates a function to extract 50th 2.5th and 97.5th centiles 
percentile <- function (x,y){
              map_dbl(x, quantile(~x,y))
              }


x <- tibble(a = list(c(rnorm(10,1)),c(rnorm(10,2)), c(rnorm(10,3)), c(rnorm(10,4))),
        b = list(c(rnorm(10,0.5)),c(rnorm(10,0.6)), c(rnorm(10,0.7)), c(rnorm(10,0.7))))

For the above tibble 'x' I would like 6 additional columns (each element of a column has length 1): a_ce, a_ll, a_ul, b_ce, b_ll and b_ul.

x <- x %>% 
  mutate_at(.vars = c('a','b'), .funs = list(ce = percentile(.,0.5))) %>% 
  mutate_at(.vars = c('a','b'), .funs = list(ll = percentile(.,0.025))) %>% 
  mutate_at(.vars = c('a','b'), .funs = list(ul = percentile(.,0.975)))

I tried to execute the above code, but it gives me an error.

Thank you

Upvotes: 0

Views: 182

Answers (1)

Ronak Shah

Reputation: 388982

You can do this in one mutate_at call, passing each function as a formula (~) so that it is evaluated per column:

library(dplyr)
library(purrr)

x %>% 
  mutate_at(vars(c('a','b')), list(ce = ~percentile(.,0.5),
                                   ll = ~percentile(.,0.025), 
                                   # 0.975 for the 97.5th centile requested in the question
                                   ul = ~percentile(.,0.975)))

# A tibble: 4 x 8
#  a          b           a_ce  b_ce   a_ll   b_ll  a_ul  b_ul
#  <list>     <list>     <dbl> <dbl>  <dbl>  <dbl> <dbl> <dbl>
#1 <dbl [10]> <dbl [10]>  1.21 0.232 -0.232 -0.371  2.02 0.673
#2 <dbl [10]> <dbl [10]>  1.65 0.845  0.935  0.222  3.29 1.58 
#3 <dbl [10]> <dbl [10]>  3.13 0.811  1.76  -0.183  3.60 1.22 
#4 <dbl [10]> <dbl [10]>  3.65 1.08   2.72  -0.574  3.93 1.49 

where percentile is :

percentile <- function (x,y) map_dbl(x, quantile, y)
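
As a small illustration of why this definition works: map_dbl calls quantile once per list element and passes y along as the probs argument, whereas the version in the question evaluates quantile(~x, y) immediately on a formula. The toy list below is made up for the example and is not part of the original data.

```r
library(purrr)

# Same definition as above: apply quantile() to each list element,
# forwarding y as the `probs` argument.
percentile <- function(x, y) map_dbl(x, quantile, y)

# Hypothetical toy input: two short numeric vectors in a list
toy <- list(1:5, c(10, 20, 30))

# One number per list element; here the medians, 3 and 20
med <- percentile(toy, 0.5)
print(med)
```

Because the result has one value per element, it can be used directly as a new column next to the list column.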

Note that mutate_at has been superseded by across as of dplyr 1.0.0.
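
A minimal sketch of the same call written with across, assuming dplyr 1.0.0 or later. The seed and the probs/names arguments in the helper are additions for this example, not part of the original answer. Note that across orders the new columns by input column (a_ce, a_ll, a_ul, b_ce, ...), which differs from the mutate_at output above.

```r
library(dplyr)
library(purrr)

set.seed(1)  # assumption: fixed seed only so the example is reproducible

# Helper as in the answer, written as a lambda with explicit argument names
percentile <- function(x, y) map_dbl(x, ~ quantile(.x, probs = y, names = FALSE))

x <- tibble(
  a = list(rnorm(10, 1), rnorm(10, 2), rnorm(10, 3), rnorm(10, 4)),
  b = list(rnorm(10, 0.5), rnorm(10, 0.6), rnorm(10, 0.7), rnorm(10, 0.7))
)

# across() applies each named function to each selected column;
# new columns are named "{.col}_{.fn}" by default
res <- x %>%
  mutate(across(c(a, b),
                list(ce = ~ percentile(.x, 0.5),
                     ll = ~ percentile(.x, 0.025),
                     ul = ~ percentile(.x, 0.975))))

print(names(res))
```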

Upvotes: 2
