Noah

Reputation: 11

Plot frequency histogram with a 4-level factor using ggplot2

I have data that I would like to plot as a histogram showing the frequencies within each level of a 4-level factor. I would like all the levels on the same histogram, in different colors. The ..ncount.. variable looks like the closest fit, but it normalizes the data to a maximum of 1, whereas I want the sum of all the frequencies within a subset to equal 1. Here is the code I have used and the accompanying graphs:

My data is "assocID", the factor is "category", and the continuous variable I am using for the histogram is "QGM".

ggplot(assocID,aes(QGM)) + 
    geom_histogram(binwidth=0.1,aes(fill=category,y(..count..)),position="dodge") +
    facet_wrap(~dyad)

Go here to see the three output images. I am a new user on Stack Overflow, so I can't post images yet. I figure the graphs will explain it better than the text!

Now, if I use (..count..)/sum(..count..), that just divides by the total count, not by the count within each subset:

ggplot(assocID,aes(QGM)) + 
    geom_histogram(binwidth=0.1,aes(fill=category,y(..count..)/sum(..count..)),position="dodge") + 
    facet_wrap(~dyad)

Finally, the ..ncount.. version doesn't do it either:

ggplot(assocID,aes(QGM)) + 
    geom_histogram(binwidth=0.1,aes(fill=category,y(..ncount..)),position="dodge") + 
    facet_wrap(~dyad)

Basically, I want to show a histogram of frequencies within each level of the factor "category".

Any help would be much appreciated!

Upvotes: 1

Views: 1834

Answers (1)

joran

Reputation: 173677

This is tough because your example isn't reproducible, but I'll take a stab that you are looking for ..density.., which per the documentation for stat_bin will yield a value that integrates to one.

Also, I'm assuming that y(..count..) was intended to be y = ..count..?
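
A minimal sketch of that suggestion, not tested against the real data: the assocID built here is a made-up stand-in with the same column names, since the original was not posted. It also goes one step beyond plain density, which is my own assumption about what was asked: because density integrates to one, multiplying it by the bin width should make the bar heights within each category (and each facet panel) sum to one. after_stat(density) is the newer spelling of ..density.. and needs ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Hypothetical stand-in data; the real assocID was not posted.
set.seed(1)
assocID <- data.frame(
  QGM      = runif(400),
  category = factor(sample(c("A", "B", "C", "D"), 400, replace = TRUE)),
  dyad     = sample(c("d1", "d2"), 400, replace = TRUE)
)

bw <- 0.1  # bin width, reused to rescale density into a proportion

p <- ggplot(assocID, aes(QGM)) +
  geom_histogram(
    # density is computed per fill group within each panel;
    # density * bin width = share of that group's observations in the bin
    aes(fill = category, y = after_stat(density * bw)),
    binwidth = bw, position = "dodge"
  ) +
  facet_wrap(~dyad)

# Check: bar heights summed per panel and per category
bars <- layer_data(p)
sums <- tapply(bars$y, list(bars$PANEL, bars$group), sum)
```

Printing p draws the plot. Dropping the "* bw" gives the plain density from the answer, where the bar areas rather than the bar heights sum to one.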

Upvotes: 3
