Jake L

Reputation: 1057

Calculate the density at one point within groups

I am plotting some density curves, and I want to add a point at the mean of each group. However, I want to plot these points along the top of the density curve, not at the baseline (0). Is there a way to compute the value of the density at the mean within each group? Code follows:

# make df
df<- data.frame(group=c("a","b",'c'),
           value=rnorm(
             3000,
             mean=c(1,2,3),
             sd=c(1,1.5,1)
           )) 
library(tidyverse)
library(ggridges)
library(ggdist)

Way 1: density ridges from ggridges package

df %>%

  # calculate mean value per group to use later
  group_by(group)%>%
  mutate(mean_value=mean(value)) %>%
    
  
  ggplot()+
  aes(x=value,y=group)+
  geom_density_ridges()+
  
  # could do with stat summary - blue points
  stat_summary(
    orientation = "y",
    fun = mean,
    geom = "point", 
    color="blue"
  )+
  
  # or could do with geom_point using precalculated value (red points)
  # nudged so we can see both. 
  geom_point(aes(x=mean_value,y=group),
             color="red",
             position = position_nudge(x=.1)
             )

Way 2: stat_halfeye from ggdist package

df %>%
  group_by(group)%>%
  mutate(mean_value=mean(value)) %>%
  
  # mutate(mean_density = density(mean_value,value))
  
  
  ggplot()+
  aes(x=value,y=group)+
  stat_halfeye()+
  
  # could do with stat summary
  stat_summary(
    orientation = "y",
    fun = mean,
    geom = "point", 
    color="blue",
    alpha=.8
  )+
  
  # or could do with geom_point using precalculated value
  # nudged so we can see both. 
  geom_point(aes(x=mean_value,y=group),
             color="red",
             position = position_nudge(x=.1)
  )

Desired output: the blue or red points should sit on top of the density curve, so I need a y aesthetic that is something like "group + density value."

I would rather use Way 2 (ggdist) than geom_density_ridges.

Thanks

Upvotes: 2

Views: 589

Answers (1)

eipi10

Reputation: 93761

I'm not sure if there's a way to calculate the height of the density curve at the mean value within the ggplot geom/stat functions, so I've created a couple of helper functions to do that.

dens_at_mean calculates the height of the density curve at the mean of the data. get_mean_coords runs dens_at_mean by group and then scales the height values to match the y-values generated by stat_halfeye and returns a data frame that can be passed to geom_point.
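As a quick sanity check of the interpolation idea behind dens_at_mean, here is a minimal base-R sketch. The seed and sample size are arbitrary choices for illustration and are not part of the answer. For a standard normal sample, the interpolated height at the mean should land near dnorm(0), roughly 0.4, give or take some kernel-smoothing error.

```r
# Sanity check: interpolate a kernel density estimate at the sample mean.
set.seed(1)                        # arbitrary seed, for reproducibility only
x <- rnorm(10000)                  # standard normal sample

d <- density(x)                    # KDE evaluated on a grid of x-values

# density() only returns heights on its grid, so interpolate linearly
# between the two grid points that bracket the mean
height_at_mean <- approx(d$x, d$y, xout = mean(x))$y

height_at_mean                     # should be close to dnorm(0)
dnorm(0)
```

The same density() plus approx() pair works for any x-value inside the range of d$x, not just the mean.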

# Packages used below
library(tidyverse)
library(ggdist)

# Reproducible data
set.seed(394)
df<- data.frame(group=c("a","b",'c'),
                value=rnorm(
                  3000,
                  mean=c(1,2,3),
                  sd=c(1,1.5,1)
                )) 

# Function to get height of density curve at mean value
dens_at_mean = function(x) { 
  d = density(x)
  mean.x = mean(x)
  data.frame(mean.x = mean.x,
             max.y = max(d$y),
             mean.y = approx(d$x, d$y, xout=mean.x)$y)
}

# Function to return data frame with properly scaled heights 
#  to plot mean points
get_mean_coords = function(data, value.var, group.var) {

  data %>% 
    group_by({{group.var}}) %>% 
    summarise(vals = list(dens_at_mean({{value.var}}))) %>% 
    ungroup() %>% 
    unnest_wider(vals) %>% 
    # Scale y-value to work properly with stat_halfeye: height relative to the
    # tallest curve, times 0.9, plus the group's position (1, 2, 3, ...)
    mutate(mean.y = (mean.y/max(max.y) * 0.9 + 1:n())) %>% 
    select(-max.y)
}

df %>%
  ggplot()+
    aes(x=value, y=group)+
    stat_halfeye() +
    geom_point(data=get_mean_coords(df, value, group), 
               aes(x=mean.x, y=mean.y),
               color="red", size=2) +
    theme_bw() +
    scale_y_discrete(expand=c(0.08,0.05))
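The scaling step inside get_mean_coords can be worked through by hand. The sketch below uses made-up density heights (not output from the data above) and assumes that 0.9 corresponds to stat_halfeye's default slab scale, with all slabs normalized against the tallest curve. The names slab_scale, global_max, baseline, and plot_y are illustrative only.

```r
# Hypothetical inputs: density height at each group mean and each curve's peak.
heights <- data.frame(
  group  = c("a", "b", "c"),
  mean.y = c(0.39, 0.26, 0.40),    # made-up heights at each group mean
  max.y  = c(0.40, 0.27, 0.41)     # made-up peak heights of each curve
)

slab_scale <- 0.9                  # assumption: matches stat_halfeye(scale = 0.9)
global_max <- max(heights$max.y)   # tallest curve across all groups

# On a discrete axis, each group's baseline is its level index: 1, 2, 3, ...
baseline <- as.integer(factor(heights$group))

# Plot position = baseline + fraction of the tallest curve * slab scale
heights$plot_y <- baseline + heights$mean.y / global_max * slab_scale
heights
```

Using as.integer(factor(group)) instead of 1:n() gives the same result here but does not depend on row order. If stat_halfeye is called with a different scale or normalize setting, or if the ggdist version in use estimates the density differently from density()'s defaults, the points may sit slightly off the curve.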


Upvotes: 3
