l0o0

Reputation: 783

ggplot2 density plot for data sets of different sizes in R

I have two data sets, one with 500 observations and one with 1000. I want to plot the densities of both data sets in one plot.
I have searched on Google for similar questions.

The data sets in the threads I found are all the same size, for example:

df <- data.frame(x = rnorm(1000, 0, 1), y = rnorm(1000, 0, 2), z = rnorm(1000, 2, 1.5))

But if the data sets have different sizes, I think I should normalize the data first in order to compare the densities between them.

Is it possible to make a density plot with data sets of different sizes in ggplot2?

Upvotes: 9

Views: 5768

Answers (1)

Claus Wilke

Reputation: 17790

By default, all densities are scaled to unit area. If you have two datasets with different amounts of data, you can plot them together like so:

library(ggplot2)

# two samples of different sizes
df1 <- data.frame(x = rnorm(1000, 0, 2))
df2 <- data.frame(y = rnorm(500, 1, 1))

# one geom_density layer per data set, each scaled to unit area
ggplot() +
  geom_density(data = df1, aes(x = x),
               fill = "#E69F00", color = "black", alpha = 0.7) +
  geom_density(data = df2, aes(x = y),
               fill = "#56B4E9", color = "black", alpha = 0.7)
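The unit-area scaling can be checked numerically. The following is a minimal sketch that uses base R's density() instead of ggplot2 itself, on the assumption that geom_density draws the same kind of kernel density estimate by default. The helper name area_under and the seed are made up for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
x1 <- rnorm(1000, 0, 2)
x2 <- rnorm(500, 1, 1)

# Hypothetical helper: trapezoidal area under a density estimate
area_under <- function(d) {
  sum(diff(d$x) * (head(d$y, -1) + tail(d$y, -1)) / 2)
}

a1 <- area_under(density(x1))
a2 <- area_under(density(x2))

# Both areas should be close to 1, whatever the sample size
print(c(a1, a2))
```

Because each curve integrates to roughly 1 on its own, the two curves are directly comparable as distributions even though one sample is twice as large as the other.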


However, from your latest comment, I take it that's not what you want. Instead, you want the areas under the density curves to be scaled relative to the amount of data in each group. You can do that by mapping y to the ..count.. computed variable (written after_stat(count) in ggplot2 3.3.0 and later):

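library(ggplot2)

# stack both samples in one data frame, with a label column to tell them apart
df1 <- data.frame(x = rnorm(1000, 0, 2), label = rep('df1', 1000))
df2 <- data.frame(x = rnorm(500, 1, 1), label = rep('df2', 500))
df <- rbind(df1, df2)

# y = ..count.. scales each curve by its group size
# (ggplot2 3.3.0 and later also accept y = after_stat(count))
ggplot(df, aes(x, y = ..count.., fill = label)) +
  geom_density(color = "black", alpha = 0.7) +
  scale_fill_manual(values = c("#E69F00", "#56B4E9"))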
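To see what the count scaling does to the areas, here is a rough sketch in base R. It assumes the count variable is the density multiplied by the number of observations in the group, which is how the ggplot2 documentation describes it. The variable names and the seed are illustrative only.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
x1 <- rnorm(1000, 0, 2)
x2 <- rnorm(500, 1, 1)

d1 <- density(x1)
d2 <- density(x2)

# Count-scaled curves: density times sample size
count1 <- d1$y * length(x1)
count2 <- d2$y * length(x2)

# density() evaluates on an evenly spaced grid, so a simple
# Riemann sum approximates the area under each curve
area1 <- sum(count1) * diff(d1$x[1:2])
area2 <- sum(count2) * diff(d2$x[1:2])

# Areas should be close to the sample sizes, 1000 and 500
print(c(area1, area2))
```

With this scaling the area under each curve is about equal to the number of observations in that group, so the larger data set visibly occupies more of the plot.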


Upvotes: 10
