Reputation: 247
I was making some boxplots with ggplot2 in R, and I wonder why the error bar is not shown for just one of the boxplots.
The code is simply this:
library(ggplot2)
# Treat Time as a categorical variable so that each time point gets its own box
ID1.4.5.6.7[, "Time"] <- as.factor(ID1.4.5.6.7[, "Time"])
ggplot(data = ID1.4.5.6.7, aes(x = Time, y = mRNA, fill = Time)) +
  geom_boxplot(notch = TRUE) +
  stat_boxplot(geom = "errorbar") +
  labs(title = "mRNA vs Time", subtitle = "Irradiated", x = "Time [min]", y = "mRNA") +
  theme(plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5))
I do not know whether this is a problem with the code, or no problem at all and just a consequence of the data.
structure(list(Gene = c("ID-1", "ID-1", "ID-1", "ID-1", "ID-1",
"ID-1", "ID-1", "ID-1", "ID-1", "ID-1", "ID-1", "ID-1", "ID-1",
"ID-1", "ID-1", "ID-4", "ID-4", "ID-4", "ID-4", "ID-4", "ID-4",
"ID-4", "ID-4", "ID-4", "ID-4", "ID-4", "ID-4", "ID-4", "ID-4",
"ID-4", "ID-4", "ID-5", "ID-5", "ID-5", "ID-5", "ID-5", "ID-5",
"ID-5", "ID-5", "ID-5", "ID-5", "ID-5", "ID-5", "ID-5", "ID-5",
"ID-5", "ID-5", "ID-5", "ID-5", "ID-6", "ID-6", "ID-6", "ID-6",
"ID-6", "ID-6", "ID-6", "ID-6", "ID-6", "ID-6", "ID-6", "ID-6",
"ID-6", "ID-6", "ID-6", "ID-6", "ID-6", "ID-6", "ID-7", "ID-7",
"ID-7", "ID-7", "ID-7", "ID-7", "ID-7", "ID-7", "ID-7", "ID-7",
"ID-7", "ID-7", "ID-7", "ID-7", "ID-7", "ID-7", "ID-7", "ID-7"
), mRNA = c(-0.181385669, -0.059647494, 0.104476117, -0.052190978,
-0.040484945, 0.194226742, -0.501601326, 0.102342605, -0.127143845,
-0.008523742, -0.102946211, -0.042894028, 0.002922923, -0.134394347,
-0.214204393, -0.138122686, 0.203242361, 0.097935502, 0.147068146,
-0.089430917, 0.331565412, -0.034572422, -0.129896329, 0.324191,
0.470108479, -0.027268223, 0.232304713, 0.090348708, 0.070848402,
0.181540708, -0.502255367, -0.267631441, -0.368647839, -0.040910404,
-0.003983171, -0.003983171, -0.003983171, -0.14980589, -0.119449612,
-0.309154214, -0.487589361, 0.272803506, -0.421733575, -0.467108567,
0.024868338, -0.156025729, -0.044680175, -0.206716896, -0.272014193,
-0.230499883, -0.238597397, -0.118130949, 0.349957464, 0.349957464,
0.349957464, 0.172048587, -0.186226994, 0.16113822, -0.293029136,
-0.111636253, -0.044189887, 0.081555274, -0.048106079, -0.05853566,
0.010407814, -0.066981809, -0.09828484, -0.315190986, -0.005102456,
0.221556197, 0.206584568, 0.206584568, 0.206584568, 0.102649006,
-0.011777384, -0.36963487, -0.054853074, -0.230240699, -0.210508323,
-0.208889919, -0.050763372, 0.023073782, -0.095118984, -0.091076071,
-0.330257395), Time = structure(c(2L, 2L, 2L, 3L, 3L, 2L, 3L,
3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 2L, 2L, 2L, 3L, 3L, 2L, 3L, 3L,
4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 2L, 2L, 2L, 1L, 1L, 1L, 3L, 3L,
2L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 2L, 2L, 2L, 1L, 1L, 1L,
3L, 3L, 2L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 2L, 2L, 2L, 1L,
1L, 1L, 3L, 2L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L), .Label = c("0",
"20", "40", "60", "120"), class = "factor"), predicted_mRNA = c(-0.00551000342030954,
-0.00551000342030954, -0.00551000342030954, -0.0302695238715682,
-0.0302695238715682, -0.00551000342030954, -0.0302695238715682,
-0.0302695238715682, -0.0550290443228268, -0.0550290443228268,
-0.0550290443228268, -0.0550290443228268, -0.129307605676603,
-0.129307605676603, -0.129307605676603, -0.00551000342030954,
-0.00551000342030954, -0.00551000342030954, -0.0302695238715682,
-0.0302695238715682, -0.00551000342030954, -0.0302695238715682,
-0.0302695238715682, -0.0550290443228268, -0.0550290443228268,
-0.0550290443228268, -0.0550290443228268, -0.129307605676603,
-0.129307605676603, -0.129307605676603, -0.129307605676603, -0.00551000342030954,
-0.00551000342030954, -0.00551000342030954, 0.0192495170309491,
0.0192495170309491, 0.0192495170309491, -0.0302695238715682,
-0.0302695238715682, -0.00551000342030954, -0.0302695238715682,
-0.0302695238715682, -0.0550290443228268, -0.0550290443228268,
-0.0550290443228268, -0.129307605676603, -0.129307605676603,
-0.129307605676603, -0.129307605676603, -0.00551000342030954,
-0.00551000342030954, -0.00551000342030954, 0.0192495170309491,
0.0192495170309491, 0.0192495170309491, -0.0302695238715682,
-0.0302695238715682, -0.00551000342030954, -0.0302695238715682,
-0.0302695238715682, -0.0550290443228268, -0.0550290443228268,
-0.0550290443228268, -0.0550290443228268, -0.129307605676603,
-0.129307605676603, -0.129307605676603, -0.00551000342030954,
-0.00551000342030954, -0.00551000342030954, 0.0192495170309491,
0.0192495170309491, 0.0192495170309491, -0.0302695238715682,
-0.00551000342030954, -0.0302695238715682, -0.0302695238715682,
-0.0550290443228268, -0.0550290443228268, -0.0550290443228268,
-0.0550290443228268, -0.129307605676603, -0.129307605676603,
-0.129307605676603, -0.129307605676603)), row.names = c(NA, -85L
), class = "data.frame")
Above is the output of dput(ID1.4.5.6.7), i.e. the data frame.
Upvotes: 1
Views: 2570
Reputation: 39613
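I would suggest this approach, where you enable varwidth in order to see the error bar (with varwidth = TRUE, each box is drawn with a width proportional to the square root of the number of observations in its group). Here is the code: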
# Plot
ggplot(data = ID1.4.5.6.7, aes(x = Time, y = mRNA, fill = Time)) +
  geom_boxplot(varwidth = TRUE, notch = TRUE) +
  stat_boxplot(geom = "errorbar") +
  labs(title = "mRNA vs Time", subtitle = "Irradiated", x = "Time [min]", y = "mRNA") +
  theme(plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5))
Output:
Upvotes: 2
Reputation: 11
Because boxplots do not have error bars. In its simplest form, a boxplot is just a graphical representation of five numbers: minimum, Q1 (1st quartile), median, Q3 (3rd quartile), and maximum. The whiskers (the "bars" going up and down) are just lines ending at the minimum value in the data (the lower one) and the maximum value (the upper one), as long as no points are flagged as outliers. The bottom edge of the "box" is Q1 and the top edge is Q3.
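As a small illustration of where the whiskers end when there are outliers, here is a sketch using base R's boxplot.stats() on made-up numbers (the vector x is purely hypothetical and has nothing to do with the data in the question). By default the whiskers stop at the most extreme data point within 1.5 times the box height (the IQR) of the box, and anything beyond that is reported separately as an outlier. ggplot2's geom_boxplot documents the same 1.5 * IQR rule, although it computes the quartiles with quantile(), so the numbers can differ slightly from base R on other data.

```r
# Toy data (made up for illustration): eight ordinary values and one extreme one
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 30)

# boxplot.stats() uses coef = 1.5 by default
bs <- boxplot.stats(x)

# lower whisker, lower hinge (Q1), median, upper hinge (Q3), upper whisker
whisker_stats <- bs$stats
print(whisker_stats)

# Points beyond the whiskers, drawn individually in a boxplot
outliers <- bs$out
print(outliers)

# The raw maximum is larger than the upper whisker end
print(max(x))
```

Here the upper whisker ends at 8 rather than at the maximum of 30, because 30 lies more than 1.5 * IQR above Q3.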
It is possible for a group of data to be arranged such that the minimum is the same as Q1 and the maximum is the same as Q3. More or less, that seems to be what's happening in the boxplot that has no whiskers. ggplot adds some extra details to the boxplot (the notch, i.e. the "waist" that is pulled in, and an algorithmic tweak that leads to the possibility of the inversion you see at the top of the Time 0 group), but these do not change the basic picture.
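This can be checked directly on the Time 0 group. The sketch below copies the nine Time 0 values from the dput() output in the question (three distinct numbers, each appearing three times) and computes the five numbers with quantile(), which is, as far as I understand, what ggplot2 uses for the box. The notch limits use the median +/- 1.58 * IQR / sqrt(n) formula given in the geom_boxplot documentation. The variable names are my own.

```r
# Time 0 values copied from the dput() output in the question
time0 <- c(rep(-0.003983171, 3), rep(0.349957464, 3), rep(0.206584568, 3))

# Minimum, Q1, median, Q3, maximum
five <- unname(quantile(time0, c(0, 0.25, 0.5, 0.75, 1)))
names(five) <- c("min", "Q1", "median", "Q3", "max")
print(five)

# Whisker lengths: distance from the box edge to the extreme value
lower_whisker <- five[["Q1"]] - five[["min"]]
upper_whisker <- five[["max"]] - five[["Q3"]]
print(c(lower = lower_whisker, upper = upper_whisker))  # both are 0 here

# Notch limits: median +/- 1.58 * IQR / sqrt(n)
iqr <- five[["Q3"]] - five[["Q1"]]
notch <- five[["median"]] + c(-1, 1) * 1.58 * iqr / sqrt(length(time0))
names(notch) <- c("lower", "upper")
print(notch)

# TRUE means the upper notch pokes above the top of the box
notch_above_box <- notch[["upper"]] > five[["Q3"]]
print(notch_above_box)
```

Both whisker lengths come out as zero, so there is nothing for the error bar to span, and the upper notch limit lies above Q3, which would explain the inversion at the top of that box.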
Edit: This looks like a question about code, but it is actually about statistics. It might be a better fit for Cross Validated (though I think it is probably sufficiently answered now).
Upvotes: 0