Reputation: 359
I'm trying to use purrr::map to compute empirical cumulative percentages for the values of x_var in a data frame, calculated separately within each level of a factor variable.
Ideally, I'd like the result to be a long data frame with the following columns:
levels (long) | x_var | epcd_val
Here's an example:
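# load packages, installing pacman first if it is missing
if (!require("pacman")) install.packages("pacman")
# call p_load via pacman:: so it also works right after a fresh install
pacman::p_load(dplyr, tibble, purrr)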
# generate fake data
samp_dat <- tibble(
  x_var = rnorm(1000, 0, 1),
  levels = sample(LETTERS[1:4], 1000, replace = TRUE,
                  prob = c(0.25, 0.50, 0.125, 0.125))
)
# build a list with one ecdf function per level
ecdfs <- samp_dat %>%
  group_split(levels) %>%
  map(~ ecdf(.x$x_var))
The resulting ecdfs is a list of ecdf functions, one for each level of levels.
I then need to feed the x_var values, grouped by levels, back into the matching function. However, I'm stuck on how to pull that off with pipes.
Upvotes: 2
Views: 113
Reputation: 887118
ecdf returns a function, so we can group by 'levels' and then pass 'x_var' straight into the function that ecdf returns:
library(dplyr)
samp_dat %>%
  group_by(levels) %>%
  mutate(newval = ecdf(x_var)(x_var))
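If we want to stay closer to the purrr approach from the question, the same idea can be sketched as below. This is only a minimal sketch: the seed, the toy vector `x`, and the column name `ecdf_val` are assumptions made for illustration, and the grouped `mutate` is usually the simpler route.

```r
library(dplyr)
library(tibble)
library(purrr)

# ecdf() returns a function; calling it on the same vector gives, for each
# value, the proportion of observations less than or equal to that value
x <- c(3, 1, 2, 2)
toy <- ecdf(x)(x)  # 1.00 0.25 0.75 0.75

set.seed(42)  # arbitrary seed, only so the sketch is reproducible
samp_dat <- tibble(
  x_var = rnorm(1000, 0, 1),
  levels = sample(LETTERS[1:4], 1000, replace = TRUE,
                  prob = c(0.25, 0.50, 0.125, 0.125))
)

# purrr variant: split by level, add the ECDF value within each piece,
# then stack the pieces back into one long data frame
# (ecdf_val is a hypothetical column name)
ecdf_long <- samp_dat %>%
  group_split(levels) %>%
  map(~ mutate(.x, ecdf_val = ecdf(x_var)(x_var))) %>%
  bind_rows()
```

Within each level, the largest x_var gets a value of 1, and the values match what the grouped `mutate` produces for the same rows.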
Upvotes: 3