sun

Reputation: 33

Using density in stat_bin with factor variables

It seems that the density computed by stat_bin doesn't work as expected for factor variables: the density is 1 for every category on the y-axis.

For example, using diamonds data:

diamonds_small <- diamonds[sample(nrow(diamonds), 1000), ]
ggplot(diamonds_small, aes(x = cut)) +  stat_bin(aes(y=..density.., fill=cut))


I understand I could use

stat_bin(aes(y=..count../sum(..count..), fill=cut))

to make it work. However, according to the docs of stat_bin, it should work with categorical variables.

Upvotes: 3

Views: 959

Answers (1)

Ben Bolker

Reputation: 226881

You can get what you (might) want by setting the group aesthetic manually.

ggplot(diamonds_small, aes(x = cut)) +  stat_bin(aes(y=..density..,group=1))
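To see why the grouping matters, here is a minimal base-R sketch of the arithmetic. It assumes the density is normalized within each group and ignores any bin-width scaling; the counts are made up for illustration and are not taken from diamonds. With one group per factor level, every count is divided by itself, which would explain the value of 1 for each category.

```r
# Hypothetical counts per factor level (illustrative only, not from diamonds)
counts <- c(Fair = 30, Good = 90, Ideal = 380)

# One group per level: each count is divided by its own group's total
per_level <- sapply(counts, function(n) n / sum(n))

# A single group (as with group = 1): each count is divided by the overall total
single_group <- counts / sum(counts)

print(per_level)     # every entry is 1
print(single_group)  # proportions that sum to 1
```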

However, you can't easily fill differently within a group. You can summarize the data yourself:

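library(plyr)
# Proportion of rows falling in each level of cut
dd_dens <- ddply(diamonds_small, .(cut),
                 function(x) data.frame(dens = nrow(x) / nrow(diamonds_small)))
# Plot the precomputed proportions as-is, filling each bar by cut
ggplot(dd_dens, aes(x = cut, y = dens)) + geom_bar(aes(fill = cut), stat = "identity")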

A slightly more compact version of the summarization step (note that the resulting columns are named Var1 and Freq by default):

as.data.frame.table(prop.table(table(diamonds_small$cut)))
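If you would rather have the compact version produce columns called cut and dens directly, something along these lines should work. This is only a sketch: the factor x is a small made-up stand-in for diamonds_small$cut, and the column names are chosen to match the plotting code above.

```r
# Hypothetical factor standing in for diamonds_small$cut
x <- factor(c("Fair", "Good", "Good", "Ideal", "Ideal", "Ideal"),
            levels = c("Fair", "Good", "Ideal"))

# Naming the table dimension and the response column avoids the
# default Var1 / Freq column names
dd_dens <- as.data.frame(prop.table(table(cut = x)), responseName = "dens")
print(dd_dens)
```

The result can then be passed to ggplot with aes(x = cut, y = dens) and stat = "identity", as in the plyr version.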

Upvotes: 2
