Evaluate dnorm for multiple parameter values and the same argument

Question

I am trying to accomplish the same as in this post, namely overlaying multiple histograms with densities. The solution in the referenced post works, but it made me wonder whether the calculation of dfn can be done with newer packages like purrr/purrrlyr:

library(dplyr)  # provides %>%, group_by() and summarise()

set.seed(1)
df <- data.frame(bsa=rnorm(200, mean=rep(c(1,4),each=100)), 
                 group=rep(c("test","control"),each=100))

stats <- df %>% group_by(group) %>% summarise(m = mean(bsa), sd = sd(bsa))
x <- with(df, seq(min(bsa), max(bsa), len=100))

dfn <- do.call(rbind,lapply(1:nrow(stats), 
                            function(i) with(stats[i,],data.frame(group, x, y=dnorm(x,mean=m,sd=sd)))))

To perform the inner lapply part, I have been trying stuff along the lines of

stats %>%
    dplyr::group_by(group) %>%
    purrr::map(x, dnorm, m, sd)

That is, passing on m and sd from stats and using the same x. Unfortunately, it doesn't work. (Once the inner part is accomplished, I can pass on the result to do.call, so that is not a problem).

Aurèle · Accepted Answer

If you go with dplyr, I don't think you really need to compute stats or x separately. I'd do:

dfn_2 <-
  df %>% 
  mutate_at(vars(bsa), funs(min, max)) %>% 
  arrange(group) %>% 
  group_by(group) %>% 
  transmute(
    x = seq(first(min), first(max), length.out = n()), 
    y = dnorm(x, mean(bsa), sd(bsa))
  ) %>% 
  as.data.frame()

all.equal(dfn, dfn_2)
# [1] TRUE
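
Note that funs() has been deprecated since dplyr 0.8.0, and the mutate_at() family has been superseded by across() since dplyr 1.0.0. Below is a minimal sketch of the same idea that avoids both, assuming a recent dplyr. The names rng and dfn_2b are introduced here for illustration only and are not part of the original answer:

```r
library(dplyr)

set.seed(1)
df <- data.frame(bsa = rnorm(200, mean = rep(c(1, 4), each = 100)),
                 group = rep(c("test", "control"), each = 100))

# Overall range of bsa, shared by every group (illustrative helper name)
rng <- range(df$bsa)

dfn_2b <-
  df %>%
  arrange(group) %>%
  group_by(group) %>%
  transmute(
    # same grid for every group; n() is the group size (100 here)
    x = seq(rng[1], rng[2], length.out = n()),
    # normal density evaluated with this group's own mean and sd
    y = dnorm(x, mean(bsa), sd(bsa))
  ) %>%
  ungroup() %>%
  as.data.frame()
```

As in dfn_2, the grid has as many points as there are rows per group, so this only reproduces dfn when each group has exactly 100 observations.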

Alternatively, here are two approaches that I do not recommend. They are shown only to demonstrate that it is possible, and how you could have done what you were trying:

library(purrr)  # provides map(), map2_df()

dfn_3 <-
  stats %>% 
  split(.$group) %>% 
  map2_df(names(.), ~ tibble(group = .y, x, y = dnorm(x, .x$m, .x$sd)))

# # A tibble: 200 x 3
#      group         x            y
#      <chr>     <dbl>        <dbl>
#  1 control -1.888921 6.490182e-09
#  2 control -1.809524 1.045097e-08
#  3 control -1.730128 1.672139e-08
#  4 control -1.650731 2.658301e-08
#  5 control -1.571334 4.199062e-08
#  6 control -1.491938 6.590471e-08
#  7 control -1.412541 1.027772e-07
#  8 control -1.333145 1.592550e-07
#  9 control -1.253748 2.451917e-07
# 10 control -1.174352 3.750891e-07
# # ... with 190 more rows

all.equal(dfn, as.data.frame(mutate_at(dfn_3, vars(group), as.factor)))
# [1] TRUE


dfn_4 <-
  stats %>% 
  group_by(group) %>% 
  transmute(x = list(x), y = map(x, dnorm, m, sd)) %>% 
  ungroup() %>% 
  tidyr::unnest()

# # A tibble: 200 x 3
#      group         x            y
#     <fctr>     <dbl>        <dbl>
#  1 control -1.888921 6.490182e-09
#  2 control -1.809524 1.045097e-08
#  3 control -1.730128 1.672139e-08
#  4 control -1.650731 2.658301e-08
#  5 control -1.571334 4.199062e-08
#  6 control -1.491938 6.590471e-08
#  7 control -1.412541 1.027772e-07
#  8 control -1.333145 1.592550e-07
#  9 control -1.253748 2.451917e-07
# 10 control -1.174352 3.750891e-07
# # ... with 190 more rows

all.equal(dfn, as.data.frame(dfn_4))
# [1] TRUE
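
As for why the attempt in the question fails: %>% passes stats as the first argument of map(), so x lands in the .f slot instead of being treated as data, and map() iterates over a single input anyway. When several columns (here m and sd) have to vary together, pmap() is one option. A minimal sketch, assuming purrr and dplyr are installed; dfn_5 is a name introduced here for illustration, and group is kept as a character column to sidestep factor handling:

```r
library(dplyr)
library(purrr)

set.seed(1)
df <- data.frame(bsa = rnorm(200, mean = rep(c(1, 4), each = 100)),
                 group = rep(c("test", "control"), each = 100),
                 stringsAsFactors = FALSE)

stats <- df %>% group_by(group) %>% summarise(m = mean(bsa), sd = sd(bsa))
x <- with(df, seq(min(bsa), max(bsa), length.out = 100))

# pmap() calls the function once per row of stats,
# matching the columns group, m and sd to the arguments by name
dfn_5 <-
  stats %>%
  pmap(function(group, m, sd) {
    data.frame(group = group, x = x, y = dnorm(x, mean = m, sd = sd),
               stringsAsFactors = FALSE)
  }) %>%
  bind_rows()
```

The result has one block of 100 rows per group, all evaluated on the same x grid, which is the shape the lapply/do.call version in the question produces.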
