Reputation: 189
I have seen example1, "How to overlay density plots in R?" and "Overlapped density plots in ggplot2" on how to make density plots. I can make a density plot with the code in the second link. However, I am wondering how I can make such a graph in ggplot or plotly.
I have looked at all the examples but cannot figure it out for my problem.
I have a toy data frame of leukemia gene expression data (data description), whose columns refer to 2 groups of individuals, ALL and AML:
leukemia_big <- read.csv("http://web.stanford.edu/~hastie/CASI_files/DATA/leukemia_big.csv")
# Label each column (patient) as ALL or AML based on its name
df <- data.frame(class = ifelse(grepl("^ALL", colnames(leukemia_big), fixed = FALSE),
                                "ALL", "AML"),
                 row.names = colnames(leukemia_big))
# Base R: density of all ALL values, then overlay the AML values
plot(density(as.matrix(leukemia_big[, df$class == "ALL"])),
     lwd = 2, col = "red")
lines(density(as.matrix(leukemia_big[, df$class == "AML"])),
      lwd = 2, col = "darkgreen")
Upvotes: 3
Views: 2068
Reputation: 5898
ggplot requires tidy data, also known as a long-format data frame. The following example produces the plot. But be careful: in the provided dataset the distribution of values is almost identical for both types of patient, so when you plot the ALL and AML patients the curves overlap and you cannot see the difference.
library(tidyverse)
leukemia_big %>%
  as_tibble() %>% # Optional, makes df a tibble, which makes debugging easier (as_data_frame() is deprecated)
  gather(key = patient, value = value, 1:72) %>% # transforms a wide df into a tidy or long df
  mutate(type = gsub('[.].*$', '', patient)) %>% # creates a variable with the type of patient
  ggplot(aes(x = value, fill = type)) + geom_density(alpha = 0.5)
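In current versions of tidyr, `gather()` is superseded by `pivot_longer()`. Below is a minimal sketch of the same reshaping with that function. It runs on a small made-up data frame, not the leukemia data: the name `toy` and its values are purely illustrative, and the column names are assumed to follow the `ALL`, `ALL.1`, `AML`, ... pattern that `read.csv` produces. Drawing the densities as outlines instead of filled areas may also make two nearly identical curves easier to tell apart.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical stand-in for leukemia_big: one column per patient,
# with read.csv-style de-duplicated names (ALL, ALL.1, AML, AML.1).
set.seed(1)
toy <- data.frame(
  ALL   = rnorm(100),
  ALL.1 = rnorm(100),
  AML   = rnorm(100),
  AML.1 = rnorm(100)
)

# Wide -> long: one row per (patient, value) pair
long <- toy %>%
  pivot_longer(cols = everything(),
               names_to = "patient",
               values_to = "value") %>%
  mutate(type = sub("[.].*$", "", patient))  # "ALL.1" -> "ALL"

# Outlined (unfilled) density curves, one colour per patient type
p <- ggplot(long, aes(x = value, colour = type)) +
  geom_density()
```

With the real data, replacing `toy` with `leukemia_big` should give the same kind of plot, since `everything()` selects all 72 patient columns.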
To demonstrate the overlap visually, the next example adds 1 unit to the value variable for all AML patients, which shifts that curve to the right:
leukemia_big %>%
  as_tibble() %>% # Optional, makes df a tibble, which makes debugging easier (as_data_frame() is deprecated)
  gather(key = patient, value = value, 1:72) %>% # transforms a wide df into a tidy or long df
  mutate(type = gsub('[.].*$', '', patient)) %>% # creates a variable with the type of patient
  mutate(value2 = if_else(condition = type == "ALL", true = value, false = value + 1)) %>% # shifts AML values by 1 so the two curves separate
  ggplot(aes(x = value2, fill = type)) + geom_density(alpha = 0.5)
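Since the question also mentions plotly: a ggplot object can usually be converted to an interactive plot with `plotly::ggplotly()`. The following is a minimal sketch, assuming the plotly package is installed. It uses a small made-up long data frame (`long_df` is a hypothetical stand-in for the reshaped leukemia data, with the second group shifted by 1 so the curves are distinguishable).

```r
library(ggplot2)
library(plotly)

# Hypothetical long-format data: one row per value, with its patient type
set.seed(42)
long_df <- data.frame(
  value = c(rnorm(200, mean = 0), rnorm(200, mean = 1)),
  type  = rep(c("ALL", "AML"), each = 200)
)

# Same kind of ggplot as above: semi-transparent filled densities
p <- ggplot(long_df, aes(x = value, fill = type)) +
  geom_density(alpha = 0.5)

# Convert the static ggplot into an interactive plotly figure
fig <- ggplotly(p)
```

Typing `fig` at the console (or in an R Markdown chunk) displays the interactive version. The same call should work on the plot built from the leukemia data if the ggplot object is first assigned to a variable.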
Upvotes: 6