Reputation: 41
I have a simple data set with two groups and a value for each group at 4 different time points. I want to display this data set as grouped boxplots over time but ggplot2 doesn't separate the time points.
This is my data:
matrix
Replicate Line Day Treatment X A WT Marker Proportion
1 C 10 low NA HuCHuD_Pos 8.62
2 C 10 low NA HuCHuD_Pos NA
1 C 18 low NA HuCHuD_Pos 30.50
3 C 18 low NA HuCHuD_Pos NA
2 C 18 low NA HuCHuD_Pos NA
1 C 50 low NA HuCHuD_Pos 26.10
2 C 50 low NA HuCHuD_Pos 31.90
1 C 80 low NA HuCHuD_Pos 12.70
2 C 80 low NA HuCHuD_Pos 26.20
1 C 10 normal NA HuCHuD_Pos NA
2 C 10 normal NA HuCHuD_Pos 17.20
1 C 18 normal NA HuCHuD_Pos 3.96
2 C 18 normal NA HuCHuD_Pos NA
1 C 50 normal NA HuCHuD_Pos 25.60
2 C 50 normal NA HuCHuD_Pos 17.50
1 C 80 normal NA HuCHuD_Pos 19.00
NA C 80 normal NA HuCHuD_Pos NA
And this is my code:
matrix = as.data.frame(subset(data.long, Line == line_single & Marker == marker_single & Day != "30"))
pdf(paste(line_name_single, marker_name_single, ".pdf"), width=10, height=10)
plot <-
  ggplot(data=matrix, aes(x=Day, y=Proportion, group=Treatment, fill=Treatment)) +
  geom_boxplot(position=position_dodge(1))
print(plot)
dev.off()
What am I doing wrong?
What I want
What I get
Thanks very much for your help!
Cheers, Paula
Upvotes: 4
Views: 6095
Reputation: 6483
This is what a minimal reproducible example for your question could look like:
matrix <- structure(list(Day = c(10L, 10L, 18L, 18L, 18L, 50L, 50L, 80L, 80L, 10L, 10L, 18L, 18L, 50L, 50L, 80L, 80L),
Treatment = c("low", "low", "low", "low", "low", "low", "low", "low", "low", "normal", "normal", "normal", "normal", "normal", "normal", "normal", "normal"),
Proportion = c(8.62, NA, 30.5, NA, NA, 26.1, 31.9, 12.7, 26.2, NA, 17.2, 3.96, NA, 25.6, 17.5, 19, NA)),
class = "data.frame", row.names = c(NA, -17L))
Suggested answer, using factor to discretize the variable Day:
ggplot(data=matrix, aes(x=factor(Day), y=Proportion, fill=Treatment)) +
  geom_boxplot(position=position_dodge(1)) +
  labs(x="Day")
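Explanation: If we pass a continuous variable to the x axis of a boxplot, ggplot2 does not convert it to a discrete variable. Without a grouping that includes Day, we therefore get only one box per treatment, pooling all days. If we convert the variable to something discrete, such as a factor or a character string, ggplot2 groups by every combination of Day and Treatment by default and we get the desired behavior. Note that group=Treatment from the question is dropped here, because an explicit group aesthetic overrides that default grouping.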
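To make the grouping behavior concrete, here is a minimal sketch that counts the boxes ggplot2 actually draws in each case. The data frame df and its values are made up for illustration (they are not the asker's data), and n_boxes is a hypothetical helper that reads the computed layer data via ggplot_build. The third plot shows an alternative I'd expect to work as well: keeping Day numeric and grouping explicitly with interaction().

```r
library(ggplot2)

# Toy data in the same shape as the question (values are made up for illustration)
df <- data.frame(
  Day        = rep(c(10L, 18L, 50L, 80L), each = 4),
  Treatment  = rep(c("low", "normal"), times = 8),
  Proportion = c(8.6, 17.2, 9.1, 15.0, 30.5, 4.0, 28.0, 6.5,
                 26.1, 25.6, 31.9, 17.5, 12.7, 19.0, 26.2, 21.0)
)

# Continuous x: Treatment is the only discrete variable, so it alone defines the groups
p_cont <- ggplot(df, aes(x = Day, y = Proportion, fill = Treatment)) +
  geom_boxplot()

# Discrete x: the default groups are all Day x Treatment combinations
p_disc <- ggplot(df, aes(x = factor(Day), y = Proportion, fill = Treatment)) +
  geom_boxplot()

# Alternative: keep Day numeric (true spacing on the axis) and group explicitly
p_grp <- ggplot(df, aes(x = Day, y = Proportion, fill = Treatment,
                        group = interaction(Day, Treatment))) +
  geom_boxplot()

# Hypothetical helper: one row of computed layer data corresponds to one box
n_boxes <- function(p) nrow(suppressWarnings(ggplot_build(p))$data[[1]])

sapply(list(continuous = p_cont, discrete = p_disc, grouped = p_grp), n_boxes)
```

The factor version spaces the days evenly, while the interaction() version keeps the real distances between days on the axis; which one is preferable depends on what the plot should emphasize.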
Also, when you use dput or one of the techniques described here, it's much easier to find and test an answer than when working from a printed data description like the one in the question (or at least I couldn't figure out how to load that example data).
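As a small sketch of what that looks like in practice: dput prints R code that recreates an object, so the output can be pasted straight into a question. The data frame below is a made-up three-row example, and the round trip through capture.output and parse is only there to show that the printed text rebuilds the same object.

```r
# Made-up example data, just to show the mechanics
df <- data.frame(
  Day        = c(10L, 10L, 18L),
  Treatment  = c("low", "normal", "low"),
  Proportion = c(8.62, 17.2, 30.5)
)

# Prints a structure(list(...), class = "data.frame", ...) call to the console;
# this is the text to paste into a question
dput(df)

# Round trip: evaluate the printed text to rebuild the object
df_text <- paste(capture.output(dput(df)), collapse = "\n")
df_copy <- eval(parse(text = df_text))

# Check that the rebuilt data frame matches the original
isTRUE(all.equal(df, df_copy))
```

For a large data set, something like dput(head(df, 20)) keeps the pasted example short.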
P.S. I think it's a bit confusing to name a variable of class data.frame 'matrix', since matrix is its own data type in R... ;)
Upvotes: 5