Reputation: 1999
Suppose we have two groups, "a" and "b", with different sample sizes.
n = 10000
set.seed(123)
dist1 = round(rnorm(n, mean = 1, sd=0.5), digits = 1)
dist2 = round(rnorm(n/10, mean = 2, sd = 0.2), digits = 1)
df = data.frame(group=c(rep("a", n), rep("b", n/10)), value=c(dist1,dist2))
I would like to translate the following stacked barplot to a stacked density plot.
library(ggplot2)
ggplot(data=df, aes(x=value, y=(..count..)/sum(..count..), fill=group)) +
geom_bar()
I know there is a position="stack" option for density plots. However, the result looks as follows: the height of each density is relative to its own group's sample size, not the total sample size. Hence the small group is, in a way, overrepresented.
ggplot(data=df, aes(x=value, fill=group)) +
geom_density(position="stack")
Is there a way to create a density plot that corresponds to the above barplot?
Upvotes: 5
Views: 1599
Reputation: 5530
Does applying the same scaling to the density plot that you used for the bar chart not give you what you're looking for?
ggplot(data=df, aes(x=value, fill=group)) +
geom_density( aes(y = ..count../sum(..count..)), position="stack", alpha=.7)
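To make the scaling explicit, here is a minimal base-R sketch of the same idea: each group's kernel density is multiplied by that group's share of all observations, so the area under each curve equals the group's share and the stacked total integrates to roughly 1. The shared grid, the padding of 1 on each side, n = 512, and the names `stacked`, `grid_from`, `grid_to` and `areas` are my own assumptions for illustration, not something taken from ggplot2's internals. The plot at the end only runs if ggplot2 is installed.

```r
# Same simulated data as in the question.
set.seed(123)
n <- 10000
df <- data.frame(
  group = c(rep("a", n), rep("b", n / 10)),
  value = c(round(rnorm(n, mean = 1, sd = 0.5), digits = 1),
            round(rnorm(n / 10, mean = 2, sd = 0.2), digits = 1))
)

# Shared evaluation grid, padded (assumed padding of 1) so kernel tails are kept.
grid_from <- min(df$value) - 1
grid_to   <- max(df$value) + 1

# One density per group, scaled by that group's share of all rows.
groups <- split(df$value, df$group)
stacked <- do.call(rbind, lapply(names(groups), function(g) {
  d <- density(groups[[g]], from = grid_from, to = grid_to, n = 512)
  data.frame(group = g, x = d$x, y = d$y * length(groups[[g]]) / nrow(df))
}))

# Riemann-sum check: each area should be close to the group's share of the data.
step  <- diff(stacked$x[1:2])
areas <- tapply(stacked$y, stacked$group, sum) * step
print(round(areas, 3))

# Optional: draw the stacked, size-weighted densities if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(stacked, aes(x = x, y = y, fill = group)) +
    geom_area(position = "stack", alpha = 0.7)
  print(p)
}
```

Because both groups are evaluated on the same x grid, the stacking is exact at every grid point rather than relying on interpolation between different grids.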
Upvotes: 5
Reputation: 46898
If you draw a density plot, the y-axis differs from the one in your first bar chart, where the y-axis shows counts divided by the total number of observations. To get something close to that, you can try the code below, which uses hist() to get the per-group counts, converts them to frequencies over the total, and then stacks them:
library(dplyr)
library(ggplot2)
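# Overall range, so both groups are binned with the same breaks.
RN = range(df$value)
df %>%
  group_by(group) %>%
  # Per-group bin midpoints and counts from hist(), without drawing a plot.
  do(data.frame(hist(.$value, breaks = seq(RN[1], RN[2], length.out = 40),
                     plot = FALSE)[c("mids", "counts")])) %>%
  # Divide by the total number of rows, not the group size.
  mutate(freq = counts / nrow(df)) %>%
  ggplot(aes(x = mids, y = freq, col = group)) +
  geom_line(position = "stack")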
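As a quick sanity check on the binning step above, the sketch below recomputes the per-bin frequencies in base R and confirms that they add up to each group's share of the whole sample (10000/11000 for "a" and 1000/11000 for "b"). The names `breaks`, `freq` and `group_share` are only illustrative, and the data are regenerated here so the snippet runs on its own.

```r
# Same simulated data as in the question.
set.seed(123)
n <- 10000
df <- data.frame(
  group = c(rep("a", n), rep("b", n / 10)),
  value = c(round(rnorm(n, mean = 1, sd = 0.5), digits = 1),
            round(rnorm(n / 10, mean = 2, sd = 0.2), digits = 1))
)

# 40 break points over the overall range give 39 bins shared by both groups.
breaks <- seq(min(df$value), max(df$value), length.out = 40)

# Per-group bin counts divided by the total sample size (one column per group).
freq <- sapply(split(df$value, df$group), function(v) {
  hist(v, breaks = breaks, plot = FALSE)$counts / nrow(df)
})

# Column sums are the group shares; together they should add up to 1.
group_share <- colSums(freq)
print(round(group_share, 3))
```

If the column sums did not match the group shares, that would suggest some values fell outside the breaks or the wrong denominator was used.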
Upvotes: 0