JasonAment

Reputation: 233

geom_density() grouped plot with discrete x axis is not smooth

I was working with a dataset that consists of two different groups of observations where the value is an integer. I wanted to plot the density of these to get a sense for how the different groups are distributed over the values.

What happened was that one group had a 'smooth' density while the other had a 'wavy' density. I know this has something to do with bandwidth and the fact that my data is tied to discrete (integer) values, but I would love it if someone could explain exactly why.

Here's an example:

data2 <- rbind(
    data.frame(group=rep('poisson1', 1000), value = rpois(1000, 5)),
    data.frame(group=rep('poisson2', 1000), value = rpois(1000, 45)))

library(ggplot2)
ggplot(data2, aes(x=value, fill=group)) +
  geom_density()


Strangely, when I create that data frame again to get a new sample, the plot is sometimes smooth.

Upvotes: 5

Views: 1902

Answers (1)

pogibas

Reputation: 28339

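The observed smoothness (or lack of it) is "caused" by the data that rpois() generates. The lambda argument of rpois() is the non-negative mean of the desired Poisson distribution, and for a Poisson distribution the variance equals the mean. Therefore, when you pass a lambda closer to zero (rpois(1000, 5)), it generates fewer unique values, because the values are bounded below by zero and spread over a narrower range. The default bandwidth of geom_density() is estimated from the spread of each group, so a narrow group of integers gets a small bandwidth and the individual integer values can show up as separate peaks.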

Consider this example:

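library(ggplot2)

# Number of draws per group and the lambda values to compare
nValue <- 1e3
nLambda <- c(1:9, seq(10, 100, 10))

# One data frame of Poisson draws per lambda, stacked into a single data frame
foo <- lapply(nLambda, function(lambda) {
    data.frame(value = rpois(nValue, lambda), lambda)
})
data <- do.call(rbind, foo)

# One density curve per lambda, colored by lambda
ggplot(data, aes(value, group = lambda, color = lambda)) +
    geom_density()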


We can see that the curves for lambda close to zero have peaks, while the curves for larger lambda are smoother.
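To put a number on this, here is a minimal sketch in base R that computes the rule-of-thumb bandwidth bw.nrd0() for the two groups from the question. It assumes that geom_density() is left at its default bw = "nrd0" and estimates the bandwidth separately for each group. The seed is arbitrary and only there for reproducibility.

```r
# Rule-of-thumb bandwidth for each group (assumption: geom_density()
# uses its default bw = "nrd0", estimated per group).
set.seed(1)  # arbitrary seed, only for reproducibility

lambdas <- c(5, 45)
bw <- sapply(lambdas, function(lambda) bw.nrd0(rpois(1000, lambda)))
names(bw) <- lambdas

# Adjacent integers are 1 apart, so a bandwidth below 1 smooths less
# than the gap between neighbouring values.
print(bw)
print(bw < 1)
```

With 1000 draws, the bandwidth for lambda = 5 should come out below 1 and the one for lambda = 45 above 1, which fits the picture of a wavy curve for the small-lambda group and a smooth one for the large-lambda group.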

You can also check this by looking at the variance within each lambda group:

ggplot(aggregate(data$value, list(data$lambda), var), aes(Group.1, x)) +
    geom_line() +
    geom_point() +
    labs(x = "Lambda",
         y = "Variance")
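If the bandwidth really is what drives the waviness, widening it should remove the peaks. The sketch below checks that with base R's density() rather than ggplot2: adjust = 2 doubles the default bandwidth, and the helper count_peaks() is a hypothetical name introduced here to count local maxima in the estimated curve. The seed is again arbitrary.

```r
# Compare the default bandwidth with a doubled one for a small-lambda sample.
set.seed(1)  # arbitrary seed, only for reproducibility
x <- rpois(1000, 5)

d_default <- density(x)              # default bandwidth (bw.nrd0)
d_wide    <- density(x, adjust = 2)  # bandwidth multiplied by 2

# Hypothetical helper: count local maxima of the estimated density curve
count_peaks <- function(d) sum(diff(sign(diff(d$y))) == -2)

print(c(bw_default = d_default$bw, bw_wide = d_wide$bw))
print(c(peaks_default = count_peaks(d_default), peaks_wide = count_peaks(d_wide)))
```

In ggplot2 the same multiplier is available as geom_density(adjust = 2), so the plot from the question can be smoothed without changing the data.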


Upvotes: 3
