Median statistical difference in ggplot

Question

I have a ggplot boxplot like this one:

library(ggplot2)
data(iris)
ggplot(iris, aes(x = "", y = Sepal.Width)) +
    geom_boxplot()

As you can see, the median is 3. Suppose the real value is 3.8. I would like to know whether there is a statistically significant difference between the real value (3.8) and the observed median (3). Which statistical test should I use, and can I run it in R? Also, is it possible to show the real value of 3.8 on the plot?

Thx!

PS: I'm using the iris dataset as an easily reproducible example for my real data.

Allan Cameron · Accepted Answer

You are looking for a one-sample Wilcoxon signed rank test:

wilcox.test(iris$Sepal.Width, mu = 3.8)
#> 
#>  Wilcoxon signed rank test with continuity correction
#> 
#> data:  iris$Sepal.Width
#> V = 113, p-value < 2.2e-16
#> alternative hypothesis: true location is not equal to 3.8
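If you want to reuse the result rather than read it off the console, something along these lines might help. It is only a sketch: the variable names (`true_median`, `res`, `sign_res`) are mine, not part of the answer above. Note that the signed rank test assumes the distribution is roughly symmetric around its centre. The second half shows a plain sign test via `binom.test`, which is a different (less powerful) test that I am adding as an alternative in case symmetry is doubtful for your real data.

```r
x <- iris$Sepal.Width
true_median <- 3.8  # hypothesised value from the question (assumed name)

# Observed sample median, for comparison with the hypothesised value
obs_median <- median(x)

# One-sample Wilcoxon signed rank test; store the result so the
# p-value can be reused later (e.g. in a plot label)
res <- wilcox.test(x, mu = true_median)
res$p.value

# Alternative sketch: a sign test, which only counts how many
# observations fall above vs. below the hypothesised median.
# Observations exactly equal to true_median are dropped.
n_above <- sum(x > true_median)
n_below <- sum(x < true_median)
sign_res <- binom.test(n_above, n_above + n_below, p = 0.5)
sign_res$p.value
```

Both tests return an object of class `htest`, so `res$p.value` and `sign_res$p.value` can be compared directly.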

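You can add a horizontal line to the boxplot with geom_hline and a text label with annotate:

ggplot(iris, aes(x = "", y = Sepal.Width)) +
  geom_boxplot() +
  # Dashed reference line at the hypothesised (true) median
  geom_hline(yintercept = 3.8, linetype = 2) +
  # annotate() draws the label once; geom_text(aes(...)) would draw it once per row
  annotate("text", label = "True median", x = 0.5, y = 3.9)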
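If you also want the test result on the figure, one possible approach is to put the p-value in the subtitle. This is a rough sketch rather than part of the answer above: `true_median`, `res` and `p` are names I made up, and the subtitle wording is just a suggestion.

```r
library(ggplot2)

true_median <- 3.8  # hypothesised value from the question (assumed name)
res <- wilcox.test(iris$Sepal.Width, mu = true_median)

p <- ggplot(iris, aes(x = "", y = Sepal.Width)) +
  geom_boxplot() +
  # Dashed reference line at the hypothesised median
  geom_hline(yintercept = true_median, linetype = 2) +
  # Single text label placed just above the line
  annotate("text", label = "True median", x = 0.5, y = true_median + 0.1) +
  labs(
    x = NULL,
    # format.pval() prints very small p-values as "<..." instead of 0
    subtitle = paste(
      "Wilcoxon signed rank test, p-value:",
      format.pval(res$p.value, digits = 2)
    )
  )
p
```

Keeping the hypothesised value in one variable means the line, the label position and the test all stay in sync if you change it for your real data.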
