Reputation: 1057
I am plotting some density curves, and I want to add a point at the mean of each group. However, I want to plot these points along the top of the density curve, not at 0. Is there a way to compute the value of the density at the mean within each group? Code follows:
# make df
df<- data.frame(group=c("a","b",'c'),
value=rnorm(
3000,
mean=c(1,2,3),
sd=c(1,1.5,1)
))
library(tidyverse)
library(ggridges)
library(ggdist)
Way 1: density ridges from the ggridges package
df %>%
# calculate mean value per group to use later
group_by(group)%>%
mutate(mean_value=mean(value)) %>%
ggplot()+
aes(x=value,y=group)+
geom_density_ridges()+
# could do with stat summary - blue points
stat_summary(
orientation = "y",
fun = mean,
geom = "point",
color="blue"
)+
# or could do with geom_point using precalculated value (red points)
# nudged so we can see both.
geom_point(aes(x=mean_value,y=group),
color="red",
position = position_nudge(x=.1)
)
Way 2: stat_halfeye from the ggdist package
df %>%
group_by(group)%>%
mutate(mean_value=mean(value)) %>%
# mutate(mean_density = density(mean_value,value))
ggplot()+
aes(x=value,y=group)+
stat_halfeye()+
# could do with stat summary
stat_summary(
orientation = "y",
fun = mean,
geom = "point",
color="blue",
alpha=.8
)+
# or could do with geom_point using precalculated value
# nudged so we can see both.
geom_point(aes(x=mean_value,y=group),
color="red",
position = position_nudge(x=.1)
)
Desired output: the blue or red points should sit at the top of the density curve, so I need a y aesthetic that is something like "group + density value".
I would rather use way 2 (ggdist) than geom_density_ridges.
Thanks
Upvotes: 2
Views: 589
Reputation: 93761
I'm not sure if there's a way to calculate the height of the density curve at the mean value within the ggplot geom/stat functions, so I've created a couple of helper functions to do that.
dens_at_mean calculates the height of the density curve at the mean of the data. get_mean_coords runs dens_at_mean by group, scales the height values to match the y-values generated by stat_halfeye, and returns a data frame that can be passed to geom_point.
# Reproducible data
set.seed(394)
df<- data.frame(group=c("a","b",'c'),
value=rnorm(
3000,
mean=c(1,2,3),
sd=c(1,1.5,1)
))
# Function to get height of density curve at mean value
dens_at_mean = function(x) {
d = density(x)
mean.x = mean(x)
data.frame(mean.x = mean.x,
max.y = max(d$y),
mean.y = approx(d$x, d$y, xout=mean.x)$y)
}
# Function to return data frame with properly scaled heights
# to plot mean points
get_mean_coords = function(data, value.var, group.var) {
data %>%
group_by({{group.var}}) %>%
summarise(vals = list(dens_at_mean({{value.var}}))) %>%
ungroup() %>%
unnest_wider(vals) %>%
# Scale y-value to work properly with stat_halfeye: group i sits at y = i,
# and 0.9 is assumed to match stat_halfeye's default slab scale
mutate(mean.y = (mean.y/max(max.y) * 0.9 + 1:n())) %>%
select(-max.y)
}
df %>%
ggplot()+
aes(x=value, y=group)+
stat_halfeye() +
geom_point(data=get_mean_coords(df, value, group),
aes(x=mean.x, y=mean.y),
color="red", size=2) +
theme_bw() +
scale_y_discrete(expand=c(0.08,0.05))
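To see what the scaling step in get_mean_coords is doing, here is a minimal base-R sketch of the same arithmetic without any plotting. It assumes the slabs are normalised across all groups and that the tallest one spans 0.9 of the gap between adjacent groups; slab_scale, baseline and plot.y are names made up for this illustration, not anything from ggdist.

```r
# Base R only; no plotting packages are needed for this check.
set.seed(394)
df <- data.frame(group = c("a", "b", "c"),
                 value = rnorm(3000, mean = c(1, 2, 3), sd = c(1, 1.5, 1)))

# Per-group density height at the mean, plus the peak height of each curve
heights <- do.call(rbind, lapply(split(df$value, df$group), function(x) {
  d <- density(x)
  data.frame(mean.x = mean(x),
             max.y  = max(d$y),
             mean.y = approx(d$x, d$y, xout = mean(x))$y)
}))

# Assumption: the tallest slab across all groups reaches 0.9 of the
# distance to the next group (hypothetical name, chosen for this sketch)
slab_scale <- 0.9

# Discrete groups a, b, c sit at y = 1, 2, 3 on the plot
heights$baseline <- seq_len(nrow(heights))

# Final plotting position: group baseline plus the rescaled density height
heights$plot.y <- heights$baseline +
  heights$mean.y / max(heights$max.y) * slab_scale

heights
```

Each plot.y should fall between its group's baseline and baseline + 0.9. If the installed ggdist version draws its slabs with a different bandwidth or estimator than density()'s defaults, or if scale or normalize are changed in stat_halfeye, the points can land slightly off the drawn curve and the constants above would need adjusting.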
Upvotes: 3