Reputation: 557
I am trying to apply map_dbl across a data frame in which many variables are nested. Each element of a nested variable contains a vector of 10,000 numbers, and the data frame has multiple variables like this.
For each element of a nested variable I want to extract the 2.5th, 50th and 97.5th centiles. I have tried this with map_dbl and it works for each element of a single nested variable. However, I would like to do it efficiently for all nested variables at once and was wondering if anyone could help.
I have given a small reproducible example below.
# creates a function to extract 50th 2.5th and 97.5th centiles
percentile <- function (x,y){
map_dbl(x, quantile(~x,y))
}
x <- tibble(a = list(c(rnorm(10,1)),c(rnorm(10,2)), c(rnorm(10,3)), c(rnorm(10,4))),
b = list(c(rnorm(10,0.5)),c(rnorm(10,0.6)), c(rnorm(10,0.7)), c(rnorm(10,0.7))))
For the above tibble 'x' I would like 6 additional columns (each element of a column has length 1): a_ce, a_ll, a_ul, b_ce, b_ll and b_ul.
x <- x %>%
mutate_at(.vars = c('a','b'), .funs = list(ce = percentile(.,0.5))) %>%
mutate_at(.vars = c('a','b'), .funs = list(ll = percentile(.,0.025))) %>%
mutate_at(.vars = c('a','b'), .funs = list(ul = percentile(.,0.975)))
I tried to execute the above code, but it gives me an error.
Thank you
Upvotes: 0
Views: 182
Reputation: 388982
You can do this in one mutate_at call:
library(dplyr)
library(purrr)
x %>%
mutate_at(vars(c('a','b')), list(ce = ~percentile(.,0.5),
ll = ~percentile(.,0.025),
ul = ~percentile(.,0.975)))
# A tibble: 4 x 8
# a b a_ce b_ce a_ll b_ll a_ul b_ul
# <list> <list> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 <dbl [10]> <dbl [10]> 1.21 0.232 -0.232 -0.371 2.02 0.673
#2 <dbl [10]> <dbl [10]> 1.65 0.845 0.935 0.222 3.29 1.58
#3 <dbl [10]> <dbl [10]> 3.13 0.811 1.76 -0.183 3.60 1.22
#4 <dbl [10]> <dbl [10]> 3.65 1.08 2.72 -0.574 3.93 1.49
where percentile is:
percentile <- function (x,y) map_dbl(x, quantile, y)
Note that mutate_at is soon going to be replaced with across in newer versions of dplyr.
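For reference, here is a minimal sketch of what the same call might look like with across, assuming dplyr 1.0.0 or later is installed. The set.seed call and the name res are my own additions for illustration and are not part of the original question; the percentile helper is written with an explicit probs argument but is meant to behave like the one above.

```r
library(dplyr)
library(purrr)
library(tibble)

# Seed added only so the sketch is reproducible (assumption, not in the question)
set.seed(1)

x <- tibble(
  a = list(rnorm(10, 1), rnorm(10, 2), rnorm(10, 3), rnorm(10, 4)),
  b = list(rnorm(10, 0.5), rnorm(10, 0.6), rnorm(10, 0.7), rnorm(10, 0.7))
)

# For each vector in the list-column x, return the quantile at probability y
percentile <- function(x, y) map_dbl(x, function(v) quantile(v, probs = y))

# across() applies every function in the named list to each selected column;
# by default the new columns are named "<column>_<function name>"
res <- x %>%
  mutate(across(c(a, b),
                list(ce = ~percentile(.x, 0.5),
                     ll = ~percentile(.x, 0.025),
                     ul = ~percentile(.x, 0.975))))

res
```

With the default naming, this should add a_ce, a_ll, a_ul, b_ce, b_ll and b_ul as double columns next to the original list-columns.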
Upvotes: 2