Crops

Group-wise histogram and density plots in `ggplot2` when an entire group has missing data

I am plotting several group-wise histograms, each with a density curve, a mean line and a count label, using ggplot2 as below.

library(ggplot2)
library(dplyr)

data(mtcars)
mtcars$gear <- as.factor(mtcars$gear)

mtcars_summ <- summarise(mtcars, .by = gear,
                         count = n(),
                         mean = mean(qsec, na.rm = TRUE))


ggplot() +
  geom_histogram(data = mtcars[, setdiff(colnames(mtcars), "gear")],
                 mapping = aes(x = qsec),
                 alpha = 0.2, bins = 10) +
  geom_histogram(data = mtcars,
                 mapping = aes(x = qsec, fill = gear, colour = gear),
                 alpha = 0.5, bins = 10) +
  geom_density(data = mtcars,
               mapping = aes(x = qsec, fill = gear, colour = gear,
                             y = after_stat(count)),
               alpha = 0.01) +
  geom_vline(data = mtcars_summ,
             aes(xintercept = mean, colour = gear),
             linetype = "dashed") +
  geom_text(data = mtcars_summ,
            aes(vjust = 1, hjust = 1.5,
                colour = gear, label = paste("n =", count)),
            x = Inf, y = Inf) +
  facet_wrap(~gear)

enter image description here

But in some of my datasets, every value in one group is missing. In such cases, the fill and colour assignments no longer match those in the plot above.

mtcars[mtcars$gear == 5, ]$qsec <- NA

mtcars_summ <- summarise(mtcars, .by = gear,
                         count = sum(!is.na(qsec)),
                         mean = mean(qsec, na.rm = TRUE))


ggplot() +
  geom_histogram(data = mtcars[, setdiff(colnames(mtcars), "gear")],
                 mapping = aes(x = qsec),
                 alpha = 0.2, bins = 10) +
  geom_histogram(data = mtcars,
                 mapping = aes(x = qsec, fill = gear, colour = gear),
                 alpha = 0.5, bins = 10) +
  geom_density(data = mtcars,
               mapping = aes(x = qsec, fill = gear, colour = gear,
                             y = after_stat(count)),
               alpha = 0.01) +
  geom_vline(data = mtcars_summ,
             aes(xintercept = mean, colour = gear),
             linetype = "dashed") +
  geom_text(data = mtcars_summ,
            aes(vjust = 1, hjust = 1.5,
                colour = gear, label = paste("n =", count)),
            x = Inf, y = Inf) +
  facet_wrap(~gear)


geom_histogram appears to ignore the group with missing data entirely when the fill and colour scales are set up.

How can I keep the colours and group order consistent with the plots where there is no missing data?

enter image description here
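One direction that might help, offered as a minimal sketch rather than a verified fix: pin the colours to the factor levels with a named palette, and pass the same palette to both the fill and colour scales. The names `gear_pal` and `p` are illustrative and not part of the code above. The sketch assumes that `scales::hue_pal()` reproduces the default discrete colours, and it shows only the histogram layer for brevity.

```r
library(ggplot2)

data(mtcars)
mtcars$gear <- as.factor(mtcars$gear)
mtcars[mtcars$gear == 5, ]$qsec <- NA  # whole group missing, as in the question

# Illustrative palette: one colour per factor level, named by level,
# built from the full set of levels rather than from the non-missing data.
gear_pal <- setNames(scales::hue_pal()(nlevels(mtcars$gear)),
                     levels(mtcars$gear))

p <- ggplot(mtcars, aes(x = qsec, fill = gear, colour = gear)) +
  geom_histogram(alpha = 0.5, bins = 10) +
  # One manual scale applied to both aesthetics. Named values are matched
  # to levels by name; limits lists every level in its original order.
  scale_colour_manual(values = gear_pal, limits = names(gear_pal),
                      aesthetics = c("colour", "fill")) +
  facet_wrap(~gear)

p
```

Because the values are matched by name, a level that contributes no bars should still keep its slot in the palette, so gears 3 and 4 would get the same colours as in the complete-data plot. The remaining layers (density, mean line, count label) could be added to `p` unchanged.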
