bbip

Reputation: 93

Overlaid histograms in R (ggplot2) with percentage value within each group

The code below produces three overlaid histograms, one for each of the groups A, B and C in my dataset:

    library(ggplot2)
    set.seed(97531)
    data <- data.frame(values = c(rnorm(1000, 5, 3),
                                  rnorm(1000, 7, 2),
                                  runif(1000, 8, 11)),
                       group = c(rep("A", 1000),
                                 rep("B", 1000),
                                 rep("C", 1000)))
    ggplot(data, aes(x = values, y = 100 * (..count..) / sum(..count..), fill = group)) +
      geom_histogram(position = "identity", alpha = 0.3, bins = 50) +
      ylab("percent")

However, the y axis measures the frequency of a given x value within the entire sample (i.e. groups A + B + C), whereas I want it to measure the frequency within each subgroup. In other words, I would like the same result as overlaying three histograms built from three separate data frames, one for each of the groups A, B and C.

Upvotes: 2

Views: 332

Answers (1)

TarJae

Reputation: 78927

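We could subset the data and add one histogram layer per group. Note that the y axis then shows raw counts for each group rather than percentages: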

    ggplot(data, aes(x = values)) +
      geom_histogram(data = subset(data, group == 'A'), fill = "red", alpha = 0.2) +
      geom_histogram(data = subset(data, group == 'B'), fill = "blue", alpha = 0.2) +
      geom_histogram(data = subset(data, group == 'C'), fill = "green", alpha = 0.2)
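
If the y axis should show the percentage within each group, as the question asks, one possible sketch is to keep a single layer and divide each bar's count by its own group's total. This assumes ggplot2 3.3.0 or later, where after_stat() is available. Inside after_stat(), group is the grouping index that ggplot2 computes for the layer, not the group column of the data frame, and the axis label is my own choice:

```r
library(ggplot2)

# Same simulated data as in the question
set.seed(97531)
data <- data.frame(values = c(rnorm(1000, 5, 3),
                              rnorm(1000, 7, 2),
                              runif(1000, 8, 11)),
                   group = rep(c("A", "B", "C"), each = 1000))

# ave(count, group, FUN = sum) repeats each group's total count on that
# group's rows, so every bar is divided by the size of its own group
# instead of by the size of the whole sample.
p <- ggplot(data, aes(x = values, fill = group,
                      y = after_stat(100 * count / ave(count, group, FUN = sum)))) +
  geom_histogram(position = "identity", alpha = 0.3, bins = 50) +
  ylab("percent within group")

print(p)
```

With this mapping the bar heights of each group should add up to 100, regardless of how many observations the other groups contain.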


Upvotes: 3
