L. Li

Reputation: 151

Make overlapping histograms with geom_histogram

I am trying to make an overlapping histogram like this:

Desired histogram

ggplot(histogram, mapping = aes(x = value)) + 
    geom_histogram(data = melt(tpm_18_L_SD), breaks = seq(1,10,by = 1),
                   aes(y = 100*(..count../sum(..count..))), alpha=0.2) + 
    geom_histogram(data = melt(tpm_18_S_SD), breaks = seq(1,10,by = 1),
                   aes(y = 100*(..count../sum(..count..))), alpha=0.2) + 
    geom_histogram(data = melt(tpm_18_N_SD), breaks = seq(1,10,by = 1),
                   aes(y = 100*(..count../sum(..count..))), alpha=0.2) + 
    facet_wrap(~variable, scales = 'free_x') + 
    ylim(0, 20) +
    ylab("Percentage of Genes") +
    xlab("Standard Deviation")

My code only plots them side by side, but I would like them to overlap. I based it on the original post this plot came from, but it did not work for me. It was originally three separate graphs, which I combined with grid and ggarrange. It looks like this right now. Thank you!

Here is the code of the three separate graphs.

SD_18_L <- ggplot(data = melt(tpm_18_L_SD), mapping = aes(x = value)) + 
  geom_histogram(aes(y = 100*(..count../sum(..count..))), breaks = seq(1, 10, by = 1)) + 
  facet_wrap(~variable, scales = 'free_x') + 
  ylim(0, 20) +
  ylab("Percentage of Genes") +
  xlab("Standard Deviation")

SD_18_S <- ggplot(data = melt(tpm_18_S_SD), mapping = aes(x = value)) + 
  geom_histogram(aes(y = 100*(..count../sum(..count..))), breaks = seq(1, 10, by = 1)) + 
  facet_wrap(~variable, scales = 'free_x') + 
  ylim(0, 20) +
  ylab("Percentage of Genes") +
  xlab("Standard Deviation")

SD_18_N <- ggplot(data = melt(tpm_18_N_SD), mapping = aes(x = value)) + 
  geom_histogram(aes(y = 100*(..count../sum(..count..))), breaks = seq(1, 10, by = 1)) + 
  facet_wrap(~variable, scales = 'free_x') + 
  ylim(0, 20) +
  ylab("Percentage of Genes") +
  xlab("Standard Deviation")

What my graphs look like now

Upvotes: 3

Views: 3737

Answers (1)

camille

Reputation: 16842

ggplot expects dataframes in a long format. I'm not sure what your data looks like, but you shouldn't have to call geom_histogram for each category. Instead, get all your data into a single dataframe (you can use rbind for this) in long format (what you're doing already with melt) first, then feed it into ggplot and map fill to whatever your categorical variable is.
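Since I can't see your data, here is a rough sketch of the "single long dataframe" step using made-up values. The `make_long` helper, the `condition` column name, and the random numbers are all assumptions for illustration; in your case the three inputs would be the results of your `melt()` calls.

```r
# Hypothetical long-format stand-ins for melt(tpm_18_L_SD), melt(tpm_18_S_SD)
# and melt(tpm_18_N_SD); the real data is not shown in the question.
set.seed(42)
make_long <- function(n) {
  data.frame(variable = "sample_1", value = runif(n, min = 1, max = 10))
}
long_L <- make_long(50)
long_S <- make_long(50)
long_N <- make_long(50)

# Tag each data frame with the group it came from
# (the column name `condition` is an assumption).
long_L$condition <- "L"
long_S$condition <- "S"
long_N$condition <- "N"

# Stack them into one long data frame for a single ggplot() call.
combined <- rbind(long_L, long_S, long_N)
```

A data frame shaped like `combined` can then go into one `ggplot()` call with `fill = condition` in `aes()`.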

Your call to facet_wrap is what puts them in 3 different plots. If you want them all on the same plot, take that line out.

An example using the iris data:

ggplot(iris, aes(x = Sepal.Length, fill = Species)) +
    geom_histogram(alpha = 0.6, position = "identity")

I decreased alpha in geom_histogram so you can see where colors overlap, and added position = "identity" so observations aren't being stacked. Hope that helps!
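If you also want the y axis as a percentage within each group, as in your original code, one possible approach is sketched below on the iris data. It assumes ggplot2 3.3.0 or later for `after_stat()`, and it relies on the `density` and `width` variables computed by `stat_bin`, where density is calculated separately for each group. The `binwidth` value and the axis label are arbitrary choices of mine.

```r
library(ggplot2)

# density * width is the share of a group's observations that falls in each bin,
# so multiplying by 100 gives a per-group percentage.
p <- ggplot(iris, aes(x = Sepal.Length, fill = Species)) +
  geom_histogram(
    aes(y = after_stat(100 * density * width)),
    binwidth = 0.25,        # arbitrary bin width for this example
    alpha = 0.6,
    position = "identity"   # overlap the bars instead of stacking them
  ) +
  ylab("Percentage of observations")

p
```

With this mapping the bars of each species should add up to 100 on their own, rather than all three species sharing one total.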

Upvotes: 3
