Dave van Brecht

Reputation: 514

ggplot density plot produces unexpected result

I have a question regarding density plots using ggplot. In order to make my problem clear I have created the following sample data:

library(data.table)
library(ggplot2)
DT2 <- data.table(Rating = c(1:19),
            Nndef = c(50, 30, 70, 70, 60, 40, 60, 30, 30, 10,
                      5, 3, 1, 0, 0, 0, 0, 0, 0))

Now I want to draw a density plot of the number of Nndefs per rating category. Before I do this, I replicate each row Nndef times so that each rating category occurs Nndef times.

DT2 <- DT2[rep(1:.N,Nndef)]

Now this should do the trick:

ggplot(DT2, aes(x =Rating))+ theme_bw() +
geom_density(aes(x=Rating))

which gives me a smooth density curve.

This is actually what I expect to happen using this data. However, consider this now

DT1 <- data.table(Rating = c(1:19),
            Nndef = c(460, 480, 1300, 2600, 5700, 4700, 9300, 10600, 7700, 8200,
                      6500, 6700, 5300, 4700, 2700, 1100, 1200, 400, 420))
DT1 <- DT1[rep(1:.N,Nndef)]
ggplot(DT1, aes(x =Rating))+ theme_bw() +
geom_density(aes(x=Rating))

which results in a kinky density curve.

I am familiar with the adjust argument in geom_density, but I am running a lot of these ggplots in a for loop. I want to obtain a smooth density plot (just like the first one using DT2) but do not want to manually adjust each figure myself. In addition, I do not understand why it produces a kinky density distribution in the latter case and a reasonably accurate one in the former case. Any thoughts?

Thank you in advance

Upvotes: 3

Views: 1087

Answers (1)

IRTFM

Reputation: 263301

You could set the adjustment factor to a fraction of the number of unique values of 'Rating':

ggplot(DT1, aes(x = Rating)) + theme_bw() +
      geom_density(adjust = length(unique(DT1$Rating))/10)

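As for why the second plot is kinky: geom_density hands the data to stats::density, whose default bandwidth rule is bw.nrd0. That rule shrinks with the sample size (proportional to n^(-1/5)), so the much larger DT1 sample gets a bandwidth well below the spacing of 1 between integer ratings, and each rating shows up as its own bump. The following is a minimal base R sketch to check this, using the counts from the question; the variable names are only illustrative.

```r
# Counts per rating category, copied from the question.
n_small <- c(50, 30, 70, 70, 60, 40, 60, 30, 30, 10, 5, 3, 1, 0, 0, 0, 0, 0, 0)
n_large <- c(460, 480, 1300, 2600, 5700, 4700, 9300, 10600, 7700, 8200,
             6500, 6700, 5300, 4700, 2700, 1100, 1200, 400, 420)

# Expand the counts into one observation per unit,
# mirroring DT[rep(1:.N, Nndef)] from the question.
x_small <- rep(1:19, times = n_small)
x_large <- rep(1:19, times = n_large)

# Default bandwidth rule of stats::density() (and hence geom_density()).
bw_small <- bw.nrd0(x_small)
bw_large <- bw.nrd0(x_large)

# Effective bandwidth for the large sample after the adjustment above.
adj <- length(unique(x_large)) / 10
bw_large_adj <- adj * bw_large

print(c(small = bw_small, large = bw_large, large_adjusted = bw_large_adj))
```

If every plot in the loop should have the same smoothness regardless of sample size, another option is to pass a fixed bandwidth to geom_density through its bw argument instead of scaling adjust; which value works best depends on the data.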

Upvotes: 4
