Reputation: 115
I know how to split density plots by a binary variable (e.g. sex), but I want to overlay and compare density plots for rows that have NA in a specified column against rows that don't.
I have my data and then create subsets:
data_NA <- data[is.na(data$x4), ]
data_notNA <- data[!is.na(data$x4), ]
I then want to create histograms and density plots of the other variables to see how they are distributed in each subset.
What would I add to compare these histograms easily side-by-side for the different subsets?
sex_hist <- ggplot(data = data) + geom_histogram(mapping = aes(x=factor(sex)), stat="count") + scale_x_discrete(labels = c("1" = "Female", "2" = "Male")) + xlab("Sex")
I could just make two plots and use grid.arrange(), but I was hoping there might be a neater way.
And how would I overlay age density plots for the different data subsets? For example:
density_DE_age <- ggplot(data = data, aes(x=age, fill = sex)) + geom_density(alpha = 0.5, position = 'identity')
(Instead of based on sex)
Upvotes: 0
Views: 300
Reputation: 17309
Create a variable indicating whether x4 is missing, then facet by it.
data$x4_missing <- is.na(data$x4)
# Counts by sex, one panel per missingness status of x4
sex_hist <- ggplot(data = data) +
  geom_histogram(mapping = aes(x=factor(sex)), stat="count") +
  scale_x_discrete(labels = c("1" = "Female", "2" = "Male")) +
  xlab("Sex") +
  facet_wrap(vars(x4_missing))
# Age densities filled by sex (as a factor, so each sex gets its own curve), faceted the same way
density_DE_age <- ggplot(data = data, aes(x=age, fill = factor(sex))) +
  geom_density(alpha = 0.5, position = 'identity') +
  facet_wrap(vars(x4_missing))
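If you would rather overlay the two subsets in one panel than facet them, one option is to map the missingness indicator to fill instead of sex. A minimal sketch, assuming the column names from the question; the simulated data frame, the factor labels, and the name density_age_by_missing are made up for illustration:

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data (same column names as the question)
set.seed(1)
data <- data.frame(
  age = rnorm(200, mean = 50, sd = 12),
  sex = sample(1:2, 200, replace = TRUE),
  x4  = ifelse(runif(200) < 0.3, NA, rnorm(200))
)

# Label each row by whether x4 is missing (labels are illustrative)
data$x4_missing <- factor(
  is.na(data$x4),
  levels = c(FALSE, TRUE),
  labels = c("x4 observed", "x4 missing")
)

# One semi-transparent density curve per subset, drawn on the same axes
density_age_by_missing <- ggplot(data, aes(x = age, fill = x4_missing)) +
  geom_density(alpha = 0.5, position = "identity") +
  labs(x = "Age", fill = NULL)
```

Printing density_age_by_missing draws both curves on shared axes, so differences in the age distribution between the two subsets are easier to see than across separate panels.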
Upvotes: 1