Gabriella

Reputation: 312

Display mean and median on two ggplot histograms

I'm a new Stack Overflow user and can't yet comment on the original post to ask a question. I came across a previous Stack Overflow answer (https://stackoverflow.com/a/34045068/11799491) and was wondering how to add two vertical lines (the group mean and the group median) to each histogram in the graph from that answer.


My attempt so far. I don't know how to bring in the grouping variable "type":

geom_vline(aes(xintercept = mean(diff)), color = "black") +
geom_vline(aes(xintercept = median(diff)), color = "red")

Upvotes: 4

Views: 3752

Answers (2)

Rui Barradas

Reputation: 76460

The easiest way is to pre-compute the mean and the median for each level of type and then pass the result to geom_vline. I will do it with aggregate.

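# `data` is the question's data frame, with columns `diff` and `type`
agg <- aggregate(diff ~ type, data, function(x) {
  c(mean = mean(x), median = median(x))
})
# aggregate() returns the two statistics as one matrix column; split it up
agg <- cbind(agg[1], agg[[2]])
# long format: one row per (type, statistic) pair
agg <- reshape2::melt(agg, id.vars = "type")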

library(ggplot2)

ggplot(data, aes(x = diff)) +
  geom_histogram() +
  geom_vline(data = agg, mapping = aes(xintercept = value,
                                       color = variable)) +
  facet_grid(~type) +
  theme_bw()
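To see why the cbind step is needed, here is a minimal base-R sketch. The data frame is a simulated stand-in (the question's actual data is not shown), and the hand-built long table at the end is only an illustration of the shape that reshape2::melt produces, not a replacement for it.

```r
# Hypothetical stand-in for the question's data: a numeric `diff`
# and a grouping column `type` (values are simulated).
set.seed(1)
data <- data.frame(
  diff = c(rnorm(50, mean = 0), rnorm(50, mean = 2)),
  type = rep(c("a", "b"), each = 50)
)

agg <- aggregate(diff ~ type, data, function(x) {
  c(mean = mean(x), median = median(x))
})

# `agg` has only two columns: `type`, and `diff`, which is a matrix
# holding the mean and the median side by side.
ncol(agg)
is.matrix(agg$diff)

# Flatten the matrix column into ordinary columns.
agg_wide <- cbind(agg[1], agg[[2]])
names(agg_wide)

# Base-R equivalent of the long table: one row per (type, statistic).
agg_long <- data.frame(
  type     = rep(agg_wide$type, times = 2),
  variable = rep(c("mean", "median"), each = nrow(agg_wide)),
  value    = c(agg_wide$mean, agg_wide$median)
)
agg_long
```

Because the long table keeps the type column, facet_grid(~type) can route each line to its own panel.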


Upvotes: 3

cardinal40

Reputation: 1263

There are a few different ways to do this, but I like creating a separate summarized data frame and then passing that into the geom_vline call. This lets you inspect the summary values directly and makes it easy to add multiple lines that are automatically drawn in the matching facet and colored by statistic:

library(tidyverse) 

df <-
  tibble(
    x = rnorm(40),
    category = rep(c(0, 1), each = 20)
  )

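# One row per (category, statistic) pair, ready for geom_vline
df_stats <-
  df %>% 
  group_by(category) %>% 
  summarize(
    mean = mean(x), 
    median = median(x)
  ) %>% 
  # gather() still works, but tidyr now recommends pivot_longer()
  gather(key = key, value = value, mean:median)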

df %>% 
  ggplot(aes(x = x)) +
  geom_histogram(bins = 20) +
  facet_wrap(~ category) +
  geom_vline(data = df_stats, aes(xintercept = value, color = key))
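Adapted to the column names in the question (diff and type), the same pattern might look like the sketch below. The data are simulated, the names stats and stat are placeholders, pivot_longer() is used in place of gather(), and scale_color_manual() is an optional extra that reproduces the black and red lines from the question's attempt.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Simulated stand-in for the question's data (an assumption).
set.seed(42)
data <- data.frame(
  diff = c(rnorm(100, mean = 0), rnorm(100, mean = 1)),
  type = rep(c("A", "B"), each = 100)
)

# One row per (type, statistic) pair.
stats <- data %>%
  group_by(type) %>%
  summarize(mean = mean(diff), median = median(diff)) %>%
  pivot_longer(c(mean, median), names_to = "stat", values_to = "value")

p <- ggplot(data, aes(x = diff)) +
  geom_histogram(bins = 20) +
  # `stats` carries the `type` column, so each line lands in its own facet
  geom_vline(data = stats, aes(xintercept = value, color = stat)) +
  scale_color_manual(values = c(mean = "black", median = "red")) +
  facet_wrap(~ type)
p
```

Mapping color to the stat column also gives a legend for free, which the two hard-coded geom_vline calls in the question would not.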


Upvotes: 6
