Reputation: 704
Could anyone help me plot the data below as a density plot where colour=variable?
> head(combined_length.m)
length seq mir variable value
1 22 TGAGGTATTAGGTTGTATGGTT mmu-let-7c-5p Ago1 8.622468
2 23 TGAGGGAGTAGGTTGTATGGTTT mmu-let-7c-5p Ago1 22.212471
3 21 TGAGGTAGTAGGTTGCATGGT mmu-let-7c-5p Ago1 9.745199
4 22 TGAGGTAGTATGTTGTATGGTT mmu-let-7c-5p Ago1 11.635982
5 22 TGAGTTAGTAGGTTGTATGGTT mmu-let-7c-5p Ago1 13.203627
6 20 TGAGGTAGTAGGCTGTATGG mmu-let-7c-5p Ago1 7.752571
ggplot(combined_length.m, aes(factor(length),value)) + geom_bar(stat="identity") + facet_grid(~variable) +
theme_bw(base_size=16)
I tried this without success:
ggplot(combined_length.m, aes(factor(length),value)) + geom_density(aes(fill=variable), size=2)
Error in data.frame(counts = c(167, 9324, 177, 150451, 62640, 74557, 4, :
arguments imply differing number of rows: 212, 6, 1, 4
I want something like this:
https://i.sstatic.net/qitOs.jpg
Upvotes: 0
Views: 8637
Reputation: 59425
Using factor(length) for x seems to create problems. Just use length.
Also, density plots display the distribution of whatever you define as x. So by definition the y axis is the density at a given value of x. In your code you seem to be trying to specify both x and y, which makes no sense for a density plot. You can specify a y in geom_density(...), but this only controls the scaling, as shown below. [Note: your example has only one value of variable (Ago1), so I created an artificial dataset.]
set.seed(1) # for reproducible example
df <- data.frame(variable=rep(LETTERS[1:3],c(5,10,15)),
length =rpois(30,25),
value =rnorm(30,mean=20,sd=5))
library(ggplot2)
ggplot(df,aes(x=length))+geom_density(aes(color=variable))
In this representation, the area under each curve is 1. This is the same as setting y=..density..
ggplot(df,aes(x=length))+geom_density(aes(color=variable,y=..density..))
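As a rough numeric check of the "area is 1" statement, here is a minimal sketch that uses base R's density() as a stand-in for the curve and integrates it with the trapezoid rule. The names x, d and area are only illustrative, and the curve ggplot2 actually draws may be trimmed to the range of the data, so treat the result as approximate.

```r
set.seed(1)                      # for a reproducible example
x <- rpois(15, 25)               # 15 artificial "length" values

d <- density(x)                  # Gaussian kernel density estimate

# Trapezoid rule: sum of (interval width) * (mean height of the two ends)
n_pts <- length(d$y)
area <- sum(diff(d$x) * (d$y[-1] + d$y[-n_pts]) / 2)

print(area)                      # should be close to 1
```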
You can also set y=..count.., which scales based on the counts. In this example, since there are 15 observations for C and only 5 for A, the blue curve (C) has three times the area of the red curve (A).
ggplot(df,aes(x=length))+geom_density(aes(color=variable,y=..count..))
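The 3:1 area ratio can be sanity-checked outside ggplot2. This is only a sketch, assuming that count scaling amounts to density times the number of observations in the group. The helper area_under() is a hypothetical name of mine, not part of any package.

```r
set.seed(1)                      # for a reproducible example
df <- data.frame(variable = rep(LETTERS[1:3], c(5, 10, 15)),
                 length   = rpois(30, 25))

# Hypothetical helper: area under (density * n), via the trapezoid rule
area_under <- function(x) {
  d <- density(x)
  y <- d$y * length(x)           # rescale density to counts
  sum(diff(d$x) * (y[-1] + y[-length(y)]) / 2)
}

areas <- tapply(df$length, df$variable, area_under)
ratio <- areas[["C"]] / areas[["A"]]

print(areas)                     # roughly 5, 10 and 15
print(ratio)                     # roughly 3
```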
You can also set y=..scaled.., which adjusts the curves so the maximum value of each is 1.
ggplot(df,aes(x=length))+geom_density(aes(color=variable,y=..scaled..))
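In recent ggplot2 releases (3.4.0 and later, as far as I know) the ..name.. notation is deprecated in favour of after_stat(name). The sketch below uses that newer spelling for the scaled version and then uses layer_data() to confirm that each group peaks at 1. The names p, built and peaks are only illustrative.

```r
library(ggplot2)

set.seed(1)                      # for a reproducible example
df <- data.frame(variable = rep(LETTERS[1:3], c(5, 10, 15)),
                 length   = rpois(30, 25))

# Same plot as above, written with after_stat() instead of ..scaled..
p <- ggplot(df, aes(x = length)) +
  geom_density(aes(color = variable, y = after_stat(scaled)))

# Pull out the computed layer data and take the maximum of each curve
built <- layer_data(p)
peaks <- tapply(built$y, built$group, max)

print(peaks)                     # one peak per group, each equal to 1
```

The same substitution should work for the other two variants, i.e. after_stat(density) and after_stat(count).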
Finally, if you want to get rid of all those annoying extra lines (the baseline and side edges drawn around each curve), use stat_density(...) instead:
ggplot(df,aes(x=length))+
stat_density(aes(color=variable),geom="line",position="identity")
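An equivalent way to write this, if you prefer to start from the geom rather than the stat, is geom_line(stat="density"). The sketch below builds both versions and checks that they compute the same curves. The names p1, p2, a and b are only illustrative.

```r
library(ggplot2)

set.seed(1)                      # for a reproducible example
df <- data.frame(variable = rep(LETTERS[1:3], c(5, 10, 15)),
                 length   = rpois(30, 25))

# Version from the answer: density stat drawn with a line geom
p1 <- ggplot(df, aes(x = length)) +
  stat_density(aes(color = variable), geom = "line", position = "identity")

# Same idea, starting from the line geom and swapping in the density stat
p2 <- ggplot(df, aes(x = length)) +
  geom_line(aes(color = variable), stat = "density")

# Compare the computed curves of the two layers
a <- layer_data(p1)
b <- layer_data(p2)
print(all.equal(a$y, b$y))
```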
Upvotes: 3