Rilo Dinga

Reputation: 79

R: frequency counts on overlaid histogram and density plot

I would like to add frequency counts to a histogram overlaid with a density plot. This is similar to a question already posted on SO by another user, but the solution provided for that question did not work for me.

This is my test dataset

df <- data.frame(cond = factor( rep(c("A","B"), each=200)), 
                 rating = c(rnorm(200), rnorm(200, mean=.8)))

This will plot a histogram with counts

ggplot(df, aes(x=rating)) + geom_histogram(binwidth=.5, colour="black", fill="white")

This will plot a density plot like this

ggplot(df, aes(x=rating)) + geom_density()

I tried to combine the two:

ggplot(df, aes(x=rating)) + geom_histogram(aes(y=..count..), binwidth=.5, colour="black", fill="white") + geom_density(alpha=.2, fill="#FF6666")

The overlaid density plot is gone.

I tried this approach

ggplot(df, aes(x=rating)) + geom_histogram(binwidth=0.5, colour="black", fill="white") + stat_bin(aes(y=..count.., ,binwidth=0.5,label=..count..), geom="text", vjust=-.5) + geom_density(alpha=.2, fill="#FF6666")

This is almost okay, but it does not show the density plot and it overwrites my binwidth value (head scratcher).

How do I keep the histogram with counts and also show the overlaid density plot?

Upvotes: 2

Views: 1851

Answers (1)

Peter

Reputation: 12699

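This will resolve your problem. The issue is related to the bin width. The area under a density curve always equals 1, whereas the bars of a count histogram have a total area of n × binwidth, so on a count axis the density curve is squashed flat near zero. You need to scale the y values of the density plot by the number of observations and the bin width.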
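To make the scaling argument concrete, here is a minimal base R sketch that checks the two areas numerically. It assumes the same simulated data as in the question; the names `bin_width`, `hist_area`, `density_area` and `scaled_area` are introduced here only for illustration, and the kernel density comes from base R's `density()` with default settings rather than from ggplot2.

```r
set.seed(1234)
rating <- c(rnorm(200), rnorm(200, mean = 0.8))
n <- length(rating)   # 400 observations
bin_width <- 0.5      # same bin width as in the plots

# Count histogram on a 0.5-wide grid that covers the data
breaks <- seq(floor(min(rating)), ceiling(max(rating)), by = bin_width)
h <- hist(rating, breaks = breaks, plot = FALSE)
hist_area <- sum(h$counts) * bin_width   # total bar area = n * bin_width

# Kernel density estimate: the area under the curve is approximately 1
d <- density(rating)
dx <- diff(d$x)[1]                       # spacing of the evaluation grid
density_area <- sum(d$y) * dx

# Multiplying by n * bin_width puts the curve on the count scale
scaled_area <- sum(d$y * n * bin_width) * dx

print(c(hist = hist_area, density = density_area, scaled = scaled_area))
```

The histogram area is exactly 400 × 0.5 = 200, while the unscaled density area is close to 1, which is why the unscaled curve is practically invisible next to the bars. After scaling, the two areas roughly agree.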

library(ggplot2)

set.seed(1234)

df <- data.frame(cond = factor( rep(c("A","B"), each=200)), 
                 rating = c(rnorm(200), rnorm(200, mean=.8)))

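# binwidth is a parameter of stat_bin(), not an aesthetic, so it goes outside aes()
ggplot(df, aes(x = rating)) + 
  geom_histogram(aes(y = ..count..), binwidth = 0.5, colour = "black", fill = "white") +
  stat_bin(aes(y = ..count.., label = ..count..), binwidth = 0.5, geom = "text", vjust = -.5) + 
  # density count = density * n; multiplying by the bin width matches the bar heights
  geom_density(aes(y = ..count.. * 0.5), alpha = .2, fill = "#FF6666")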


# This is more elegant: using the built-in computed variables for the geom_ functions.
# Note that ..ncount.. and ..scaled.. rescale both layers to a maximum of 1,
# so the y axis no longer shows raw counts (the text labels still do).


ggplot(df, aes(x = rating)) + 
  geom_histogram(aes(y = ..ncount..), binwidth = 0.5, colour = "black", fill = "white") +
  stat_bin(aes(y = ..ncount.., label = ..count..), binwidth = 0.5, geom = "text", vjust = -.5) + 
  geom_density(aes(y = ..scaled..), alpha = .2, fill = "#FF6666")

Which results in:

[Image: the resulting histogram with count labels and the overlaid density curve]
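On recent versions of ggplot2 the `..count..` notation still works but triggers a deprecation warning (as far as I know, since version 3.4.0), with `after_stat()` as the replacement. A sketch of the first plot in that style, assuming ggplot2 3.3.0 or later; `bin_width` and `p` are names introduced here for illustration, so that the bin width is defined once and reused in all three layers:

```r
library(ggplot2)

set.seed(1234)

df <- data.frame(cond = factor(rep(c("A", "B"), each = 200)),
                 rating = c(rnorm(200), rnorm(200, mean = .8)))

bin_width <- 0.5  # single definition, shared by bars, labels and density scaling

p <- ggplot(df, aes(x = rating)) +
  # bars on the count scale
  geom_histogram(aes(y = after_stat(count)), binwidth = bin_width,
                 colour = "black", fill = "white") +
  # count labels above each bar; binwidth is passed as a parameter, not inside aes()
  stat_bin(aes(y = after_stat(count), label = after_stat(count)),
           binwidth = bin_width, geom = "text", vjust = -.5) +
  # for the density stat, count = density * n; times bin_width matches the bars
  geom_density(aes(y = after_stat(count * bin_width)),
               alpha = .2, fill = "#FF6666")

print(p)
```

Keeping the bin width in one variable avoids the situation where the bars, the labels and the density scaling silently use different values.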

Upvotes: 4
