stevec

Reputation: 52308

Programmatically scale a density curve made with geom_density to a similar height as geom_histogram?

Suppose we make a histogram

library(ggplot2)

set.seed(123)
x = rnorm(1000)
qplot(x, geom = 'blank') +
  geom_histogram()

to which we add a density line

qplot(x, geom = 'blank') +
  geom_histogram() +
  geom_density()

the density line is so low it is hardly visible, so it can be scaled to match the height of the histogram:

qplot(x, geom = 'blank') +
  geom_histogram(bins = 30) +
  geom_density(aes(y=0.22 * ..count..)) 

Question

How can we adjust the density line programmatically when not using the binwidth argument to geom_histogram (i.e. when using the bins argument)?

The desired output is a geom_histogram(bins = ...) with sensibly scaled density line that doesn't rely on any manual computation of a multiplier / hard coding.

Upvotes: 1

Views: 712

Answers (2)

atsyplenkov

Reputation: 1304

You can use the density estimate instead of the count. It can now be accessed easily via after_stat. Also take a look at the ndensity option, which may be what you were looking for.

library(ggplot2)
library(patchwork)

set.seed(123)
x = rnorm(1000)

# Example of kernel density estimate usage
den <- qplot(x, geom = 'blank') +
  geom_histogram(aes(y= after_stat(density))) +
  geom_density() +
  ggtitle("Density estimate")

# Example of kernel density estimate usage, scaled to a maximum of 1
nden <- qplot(x, geom = 'blank') +
  geom_histogram(aes(y= after_stat(ndensity))) +
  geom_density(aes(y= after_stat(ndensity))) +
  ggtitle("Density estimate, scaled to 1")

# Plot
den | nden
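
For intuition on why the first plot lines up: on the density scale each bar's height is its count divided by (n × bin width), so the bars integrate to 1 just like the kernel density curve. A minimal check of that relationship, using base R's hist() rather than ggplot2 (the variable names `widths` and `manual` are only illustrative):

```r
set.seed(123)
x <- rnorm(1000)

# Bin the data without drawing anything
h <- hist(x, breaks = 30, plot = FALSE)

# Width of each bin
widths <- diff(h$breaks)

# Density per bin recomputed by hand: count / (n * bin width)
manual <- h$counts / (sum(h$counts) * widths)

# Compare with the density reported by hist(), then check the total bar area
all.equal(h$density, manual)
sum(h$density * widths)
```

The same identity read the other way round (count = density × n × bin width) is what any count-scale rescaling of the density line relies on.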


Upvotes: 1

teunbrand

Reputation: 37943

Yes, you can, with the caveat that you have to specify the number of bins beforehand. This is merely because layers can share data but don't share calculated parameters, i.e. the density layer does not know about the bins/binwidth parameter of the histogram layer. The following requires ggplot2 v3.3.0 or later.

nbins <- 30
qplot(x, geom = 'blank') +
  geom_histogram(bins = nbins) +
  geom_density(aes(y = stage(nbins, after_stat = count * diff(range(x))/nbins))) 
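
If passing a bin width instead of a bin count is acceptable, a variation on the same idea is to compute the width once and hand it to both layers, so they agree exactly. This is only a sketch: the name `bw` and the choice of deriving it as range / nbins are my own assumptions, and the histogram will end up with roughly (not necessarily exactly) nbins bars.

```r
library(ggplot2)

set.seed(123)
x <- rnorm(1000)

# Assumption: derive a bin width from the desired number of bins ourselves
nbins <- 30
bw <- diff(range(x)) / nbins

p <- ggplot(data.frame(x = x), aes(x)) +
  # Same width used for the bars ...
  geom_histogram(binwidth = bw) +
  # ... and for the curve: after_stat(count) is density * n, so multiplying
  # by the bin width gives the expected count per bin
  geom_density(aes(y = after_stat(count) * bw))

p
```

The multiplier is still computed rather than hard-coded, and changing nbins rescales the bars and the curve together.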

Upvotes: 2
