Reputation: 708
I have a series of data that indicates how long ago a certain type of DNA element was active in the genome. It might look something like this:
data.df <- data.frame(name=c("type1", "type1", "type1", "type2", "type2", "type2"),
active=c(9,11,10,21,21,18))
So there are three 'type1' elements active approximately 10 years ago and three 'type2' elements active approximately 20 years ago.
I've created a stacked density plot using ggplot2 to get a distribution of when each element was active, something like this:
ggplot(data.df, aes(x=active)) + geom_density(position="stack", aes(fill=name))
I have information on the relative abundances of these elements, and I would like to multiply the height of each element's density by that number. This would give me the actual abundance of activity of these elements in the genome, rather than just a distribution of their activity.
So my question boils down to: how do I multiply the height of each element type's density by a group-specific factor? For example, if I had 1000 type1 elements in the genome and only 3 type2 elements, the stacked density plot would be dominated by type1, and you'd hardly see the curve associated with type2.
I hope this makes sense. Thanks in advance!
Upvotes: 3
Views: 1143
Reputation: 4474
I am not sure if I have understood your question correctly, but is this what you want?
ggplot(data.df) +
  geom_density(aes(x=active, y=..scaled.., fill=name), position="stack")
ggplot2's help under stat_density says that scaled gives the "density estimate, scaled to maximum of 1".
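To make that scaling concrete, here is a minimal base R sketch. It assumes, going by the quoted help text, that scaled is simply the density estimate divided by its own maximum; the variable names are illustrative only. Note that each group's curve then peaks at 1 regardless of how many elements the group contains, so this option normalizes heights rather than encoding abundance.

```r
# Activity times for each type, taken from the question's example data.
type1 <- c(9, 11, 10)
type2 <- c(21, 21, 18)

# Kernel density estimates with density()'s default bandwidth.
d1 <- density(type1)
d2 <- density(type2)

# Assumed definition of "scaled": density divided by its maximum.
scaled1 <- d1$y / max(d1$y)
scaled2 <- d2$y / max(d2$y)

# Both curves now peak at the same height.
range(scaled1)
range(scaled2)
```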
Alternatively, you could also add a weight column (e.g., wght) to your data.frame, use the weight argument in geom_density and ignore the warning message:
data.df <- data.frame(name=c("type1","type1","type1","type1","type1","type1","type2","type2","type2"),
                      active=c(1.1,1,1,1,1,1,17.1,17,17),
                      stringsAsFactors=FALSE)
# per-row weights: six type1 rows at 1/6 each, three type2 rows at 4/9 each
data.df <- within(data.df, wght <- c(rep(1/6,6), rep(4/9,3)))
ggplot(data.df) +
  geom_density(aes(x=active, y=..density.., fill=name, weight=wght), position="stack")
However, I do not exactly know how geom_density handles weights that do not sum up to 1.
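One way to sidestep that uncertainty is to compute each group's density yourself, multiply it by the group's abundance, and plot the result as a stacked area. The following is only a sketch under some assumptions: the abundances (1000 and 3) are the hypothetical numbers from the question, the names abundance, scaled.df and abundance_density are made up for illustration, and base R's density() is used with its default bandwidth, so the curves may not match geom_density exactly.

```r
library(ggplot2)

data.df <- data.frame(
  name   = c("type1", "type1", "type1", "type2", "type2", "type2"),
  active = c(9, 11, 10, 21, 21, 18)
)

# Hypothetical abundance per type (the numbers used in the question).
abundance <- c(type1 = 1000, type2 = 3)

# Evaluate every group on the same x grid so the areas can be stacked.
grid_from <- min(data.df$active) - 5
grid_to   <- max(data.df$active) + 5

scaled.df <- do.call(rbind, lapply(names(abundance), function(nm) {
  d <- density(data.df$active[data.df$name == nm],
               from = grid_from, to = grid_to, n = 512)
  # Multiply the density height by this group's abundance.
  data.frame(name = nm, active = d$x,
             abundance_density = d$y * abundance[[nm]])
}))

p <- ggplot(scaled.df, aes(x = active, y = abundance_density, fill = name)) +
  geom_area(position = "stack")
p
```

Because each density integrates to roughly 1 before scaling, the area under each group's curve is approximately its abundance, which is why type1 dominates the plot here.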
Upvotes: 3