BioMan

Reputation: 704

Plot barplot as density plot in ggplot

Could anyone help me to plot the data below as a density plot where colour=variable?

> head(combined_length.m)
  length                     seq           mir variable     value
1     22  TGAGGTATTAGGTTGTATGGTT mmu-let-7c-5p     Ago1  8.622468
2     23 TGAGGGAGTAGGTTGTATGGTTT mmu-let-7c-5p     Ago1 22.212471
3     21   TGAGGTAGTAGGTTGCATGGT mmu-let-7c-5p     Ago1  9.745199
4     22  TGAGGTAGTATGTTGTATGGTT mmu-let-7c-5p     Ago1 11.635982
5     22  TGAGTTAGTAGGTTGTATGGTT mmu-let-7c-5p     Ago1 13.203627
6     20    TGAGGTAGTAGGCTGTATGG mmu-let-7c-5p     Ago1  7.752571

ggplot(combined_length.m, aes(factor(length),value)) + geom_bar(stat="identity") + facet_grid(~variable) +
  theme_bw(base_size=16)

I tried this without success:

ggplot(combined_length.m, aes(factor(length),value)) + geom_density(aes(fill=variable), size=2)

Error in data.frame(counts = c(167, 9324, 177, 150451, 62640, 74557, 4,  : 
  arguments imply differing number of rows: 212, 6, 1, 4


I want something like this:

https://i.sstatic.net/qitOs.jpg

Upvotes: 0

Views: 8637

Answers (1)

jlhoward

Reputation: 59425

Using factor(length) for x seems to create problems, since a density is estimated over a continuous variable. Just use length.
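As a small aside on factors versus numbers: the sketch below uses the six lengths from the head() output in the question and assumes, purely for illustration, that the column had been stored as a factor. The names len and f are made up for this example. It shows why a factor should not be passed straight to as.numeric() when a numeric x is needed.

```r
# Lengths copied from the head() output in the question
len <- c(22, 23, 21, 22, 22, 20)

# Hypothetical situation: the column was stored as a factor
f <- factor(len)

# as.numeric() on a factor returns the internal level codes, not the values
codes <- as.numeric(f)

# Converting through character recovers the original numbers
values <- as.numeric(as.character(f))

print(codes)   # 3 4 2 3 3 1
print(values)  # 22 23 21 22 22 20
```

In the question the column is already numeric, so dropping the factor() wrapper in aes() is all that is needed.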

Also, density plots display the distribution of whatever you define as x. So by definition the y axis is the density at a given value of x. In your code you seem to be trying to specify both x and y, which makes no sense. You can specify a y in geom_density(...) but this controls the scaling, as shown below. [Note: Your example has only one type of variable (Ago1) so I created an artificial dataset].

set.seed(1)   # for reproducible example
df <- data.frame(variable=rep(LETTERS[1:3],c(5,10,15)),
                 length  =rpois(30,25),
                 value   =rnorm(30,mean=20,sd=5))

library(ggplot2)
ggplot(df,aes(x=length))+geom_density(aes(color=variable))

In this representation, the area under each curve is 1. This is the same as setting y=..density..

ggplot(df,aes(x=length))+geom_density(aes(color=variable,y=..density..))
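A note for readers on recent ggplot2 releases: the ..density.. notation still works, but since ggplot2 3.4.0 it triggers a deprecation warning, and after_stat() (available from 3.3.0) is the suggested replacement. The sketch below rebuilds the artificial dataset so it runs on its own; it assumes ggplot2 3.3.0 or later is installed.

```r
library(ggplot2)   # assumes ggplot2 >= 3.3.0 for after_stat()

set.seed(1)        # same artificial data as in the answer
df <- data.frame(variable = rep(LETTERS[1:3], c(5, 10, 15)),
                 length   = rpois(30, 25),
                 value    = rnorm(30, mean = 20, sd = 5))

# after_stat(density) refers to the same computed variable as ..density..
p <- ggplot(df, aes(x = length)) +
  geom_density(aes(color = variable, y = after_stat(density)))
print(p)
```

The same substitution should apply to the other computed variables, i.e. after_stat(count) and after_stat(scaled).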

You can also set y=..count.. which scales based on the counts. In this example, since there are 15 observations for C and only 5 for A, the blue curve (C) has three times the area of the red curve (A).

ggplot(df,aes(x=length))+geom_density(aes(color=variable,y=..count..))

You can also set y=..scaled.. which adjusts the curves so the maximum value in each corresponds to 1.

ggplot(df,aes(x=length))+geom_density(aes(color=variable,y=..scaled..))
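To make the three scalings concrete, here is a rough check in base R on a single group of 15 values. It uses stats::density() directly rather than ggplot2, so the numbers need not match the plots exactly; the variable names are made up for this sketch. The idea is that count is the density multiplied by the number of observations, and scaled is the density divided by its maximum.

```r
set.seed(1)
x  <- rpois(15, 25)        # 15 observations, like group C in the answer
d  <- density(x)           # base R kernel density estimate
dx <- diff(d$x)[1]         # spacing of the evaluation grid

# Riemann-sum approximations of the area under each version of the curve
area_density <- sum(d$y) * dx               # close to 1
area_count   <- sum(d$y * length(x)) * dx   # close to 15 (density times n)

# Scaled version: divide by the peak so the maximum is exactly 1
scaled_max <- max(d$y / max(d$y))

print(c(area_density, area_count, scaled_max))
```

With 5 observations in place of 15, the count area would come out close to 5, which is where the factor of three between C and A comes from.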

Finally, if you want to get rid of all those annoying extra lines (the baseline and vertical edges drawn around each curve), use stat_density(...) with geom="line" instead:

ggplot(df,aes(x=length))+
  stat_density(aes(color=variable),geom="line",position="identity")

Upvotes: 3
