bpar

Reputation: 392

ggplot boxplot but with boxes extending to 5th and 95th percentiles

I would like a boxplot to summarize the distribution of some underlying data, but instead of the whiskers extending to the 5th and 95th percentiles, I would like the boxes themselves to extend to the 5th and 95th percentiles.

Standard boxplot with outliers and whiskers removed:

library("ggplot2")
p <- ggplot(mtcars, aes(factor(cyl), mpg))
p + geom_boxplot(outlier.shape = NA, coef = 0)

Boxplot with whiskers at the 5th and 95th percentiles:

p + stat_summary(geom = "boxplot", 
                 fun.data = function(x) setNames(quantile(x, c(0.05, 0.25, 0.5, 0.75, 0.95)), 
                                                 c("ymin", "lower", "middle", "upper", "ymax")))


But what I really want is for the boxes (with no whiskers) to extend to the 5th and 95th percentiles, i.e. a combination of both modifications. Is there a way to specify the box-generating function in stat_summary()?

Upvotes: 1

Views: 1543

Answers (2)

bpar

Reputation: 392

stat_summary() lets the user specify ymin, lower, middle, upper, and ymax directly. Setting ymin equal to lower and ymax equal to upper in the original code collapses the whiskers to zero length, so the box itself extends to the specified percentiles at both ends.

## preamble
library("ggplot2")
p <- ggplot(mtcars, aes(factor(cyl), mpg))
p + geom_boxplot(outlier.shape = NA, coef = 0)

## original with whiskers extending to the 5th and 95th
p + stat_summary(geom = "boxplot", 
                 fun.data = function(x) setNames(quantile(x, c(0.05, 0.25, 0.5, 0.75, 0.95)), 
                                                 c("ymin", "lower", "middle", "upper", "ymax")))

original range with whiskers extending to 0.05 and 0.95

## modified range with boxes extending to the 5th and 95th (no whiskers)
p + stat_summary(geom = "boxplot", 
                 fun.data = function(x) setNames(quantile(x, c(0.05, 0.05, 0.5, 0.95, 0.95)), 
                                                 c("ymin", "lower", "middle", "upper", "ymax")))

modified range with boxes extending to 0.05 and 0.95
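The same idea can be written with a small named helper, which may be easier to reuse and to check than the inline function. This is only a sketch: the name box_5_95 is made up for illustration, and it returns a one-row data frame rather than a named vector, which is the form the stat_summary() documentation describes for fun.data.

```r
# Hypothetical helper (the name is illustrative, not part of ggplot2):
# summary statistics for a box spanning the 5th to 95th percentiles,
# with a median line and zero-length whiskers.
box_5_95 <- function(x) {
  q <- unname(quantile(x, c(0.05, 0.5, 0.95)))
  data.frame(
    ymin   = q[1],  # whisker end equals the box edge, so no lower whisker
    lower  = q[1],
    middle = q[2],
    upper  = q[3],
    ymax   = q[3]   # likewise, no upper whisker
  )
}

# Plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p_box <- ggplot(mtcars, aes(factor(cyl), mpg)) +
    stat_summary(geom = "boxplot", fun.data = box_5_95)
  print(p_box)
}
```

Calling box_5_95(mtcars$mpg) on its own is a quick way to inspect the values that each box will be drawn from.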

Upvotes: 0

colonelforbin97

Reputation: 105

This might be a slightly "hacky" way to do it, but the easiest approach might be to draw a geom_segment for each cylinder class. This lets you specify both the width of the box and the values it should reach. You could then adjust the aes() mappings and add a median line using stat_summary() if desired.

library(ggplot2)
p <- ggplot(mtcars, aes(factor(cyl), mpg))
p + geom_segment(aes(x = 4, xend = 4,
                     y = quantile(subset(mtcars, mtcars$cyl == 4)$mpg, 0.95),
                     yend = quantile(subset(mtcars, mtcars$cyl == 4)$mpg, 0.05)),
                 color = 'firebrick1', lwd = 28)
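The call above draws only the 4-cylinder class. One possible way to cover every class at once is to compute the percentiles per group first and hand that table to geom_segment. This is a rough sketch, not the answer's original method: box_df and its column names q05, q50, and q95 are made up here, and the median is drawn as a point rather than a line to keep the example short.

```r
# Per-group 5th/50th/95th percentiles, computed once outside aes().
# box_df and its column names are illustrative assumptions.
groups <- split(mtcars$mpg, mtcars$cyl)
box_df <- data.frame(
  cyl = factor(names(groups)),
  q05 = sapply(groups, quantile, probs = 0.05, names = FALSE),
  q50 = sapply(groups, median),
  q95 = sapply(groups, quantile, probs = 0.95, names = FALSE)
)

# Plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p_seg <- ggplot(box_df, aes(x = cyl)) +
    # One thick segment per cylinder class, from the 5th to the 95th percentile.
    geom_segment(aes(xend = cyl, y = q05, yend = q95),
                 color = 'firebrick1', lwd = 28) +
    # Mark the median of each class.
    geom_point(aes(y = q50)) +
    ylab("mpg")
  print(p_seg)
}
```

As in the original snippet, lwd sets the drawn width of the "box", so it likely needs tuning to the plot size.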

Upvotes: 0
