GoGonzo

Reputation: 2877

Display custom axis labels in ggplot2

I'd like to plot a histogram and a density curve on the same plot. What I would like to add to the code below is a custom y-axis label, something like sprintf("[%s] %s", ..density.., ..count..), that is, two numbers at each tick value. Is it possible to get this with scale_y_continuous, or do I need a workaround?

Below is my current progress, using scales::trans_new and sec_axis. The sec_axis version is acceptable, but the output I would most like is the one in the image below.

library(ggplot2)
library(scales)  # trans_new(), percent

set.seed(1)
var <- rnorm(4000)
binwidth <- 2 * IQR(var) / length(var) ^ (1 / 3)


count_and_proportion_label <- function(x) {
  sprintf("%s [%.2f%%]", x, x/sum(x) * 100)
}


ggplot(data = data.frame(var = var), aes(x = var, y = ..count..)) +
  geom_histogram(binwidth = binwidth) +
  geom_density(aes(y = ..count.. * binwidth)) +
  scale_y_continuous(
    # this way
    trans = trans_new(name = "count_and_proportion",
                      format =  count_and_proportion_label,
                      transform = function(x) x,
                      inverse = function(x) x),
    # or this way
    sec.axis = sec_axis(trans = ~./sum(.),
                        labels = percent,
                        name = "proportion (in %)")
  )

I've also tried building the breaks beforehand from the output of graphics::hist, but the two histograms differ.

bins <- (max(var) - min(var))/binwidth
hdata <- hist(var, breaks = bins, right = FALSE)
# hist generates different bins than `ggplot2`
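
One likely reason for the mismatch is that hist() treats a single number passed as breaks only as a suggestion and then picks "pretty" breakpoints. Rather than rebuilding the bins with hist(), the bins that ggplot2 itself computed can be read back from the plot. The following is a minimal sketch using layer_data(); the names p, bin_data and the share column are made up for this illustration.

```r
library(ggplot2)

set.seed(1)
var <- rnorm(4000)
binwidth <- 2 * IQR(var) / length(var)^(1 / 3)

p <- ggplot(data.frame(var = var), aes(x = var)) +
  geom_histogram(binwidth = binwidth)

# layer_data() returns the data frame ggplot2 computed for a layer;
# for geom_histogram it includes xmin, xmax, count and density per bin.
bin_data <- layer_data(p, 1)

# Share of observations per bin, derived from ggplot2's own counts
# (the "share" column is an illustrative addition).
bin_data$share <- bin_data$count / sum(bin_data$count)

head(bin_data[, c("xmin", "xmax", "count", "share")])
```

The counts in bin_data are the ones drawn in the plot, so any breaks or labels derived from them line up with the histogram.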

At the end I would like to get something like this:

[image: desired output]

Upvotes: 0

Views: 923

Answers (2)

Joris

Reputation: 417

You can achieve your desired output by creating a custom set of labels, and adding it to the plot:

library(tidyverse)
library(ggplot2)

set.seed(1)
var <- rnorm(400)
bins <- .1

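# Note: the percentage labels are hard-coded for illustration,
# they are not computed from the data
df <- data.frame(yvals = seq(0, 20, 5), labels = c("[0%]", "[10%]", "[20%]", "[30%]", "[40%]"))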
df <- df %>% tidyr::unite("custom_labels", labels,  yvals, sep = " ", remove = TRUE)

ggplot(data = data.frame(var = var), aes(x = var, y = ..count..)) +
  geom_histogram(aes(y = ..count..), binwidth = bins) +
  geom_density(aes(y = ..count.. * bins), color = "black", alpha = 0.7) +
  ylab("[density] count") +
  scale_y_continuous(breaks = seq(0, 20, 5), labels = df$custom_labels)

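
The label strings do not have to be typed by hand: scale_y_continuous also accepts a function for labels, which receives the tick values and returns the strings to display. The following is a minimal sketch, assuming the bracketed number should be the share of observations in a bin (count divided by the number of observations). The names count_share_label and n_obs are made up for this illustration, and after_stat() needs ggplot2 3.3.0 or later (the older ..count.. notation does the same job).

```r
library(ggplot2)

set.seed(1)
var <- rnorm(400)
binwidth <- 0.1
n_obs <- length(var)

# Hypothetical helper: format each tick as "[share of observations] count".
count_share_label <- function(breaks) {
  sprintf("[%.2f%%] %s", breaks / n_obs * 100, breaks)
}

p <- ggplot(data.frame(var = var), aes(x = var)) +
  geom_histogram(aes(y = after_stat(count)), binwidth = binwidth) +
  # Scale the density curve to the count axis, as in the code above.
  geom_density(aes(y = after_stat(count) * binwidth)) +
  ylab("[share] count") +
  # The function is applied to whatever breaks the scale chooses.
  scale_y_continuous(labels = count_share_label)

p
```

Because the labels are computed from the tick values, they stay consistent if the breaks or the data change.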

Upvotes: 0

mhovd

Reputation: 4087

Would it be acceptable to add the percentage as a secondary axis? For example:

your_plot + scale_y_continuous(sec.axis = sec_axis(~.*2, name = "[%]"))

Perhaps it would be possible to overlay the secondary axis on the primary one, but I'm not sure how you would go about doing that.

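
For the secondary axis to show a real percentage, the transformation has to map a count to its share of the data. The following is a minimal sketch, assuming "percentage" means count divided by the number of observations; n_obs and the axis names are made up for this illustration. The formula is passed positionally, because the argument is called trans in older ggplot2 releases and transform in newer ones.

```r
library(ggplot2)

set.seed(1)
var <- rnorm(400)
binwidth <- 0.1
n_obs <- length(var)

p <- ggplot(data.frame(var = var), aes(x = var)) +
  geom_histogram(binwidth = binwidth) +
  scale_y_continuous(
    name = "count",
    # Secondary axis: each count divided by the number of observations,
    # formatted as a percentage by scales::percent.
    sec.axis = sec_axis(~ . / n_obs,
                        labels = scales::percent,
                        name = "share of observations")
  )

p
```

Dividing by a fixed n_obs, rather than by sum(.) as in the question, keeps the transformation tied to the data instead of to the values generated along the axis.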

Upvotes: 1
