doctorate

Reputation: 1413

Marking the very end of the two whiskers in each boxplot in ggplot2 (R)

My question is as follows: in boxplots made with the R package ggplot2, how can I mark the two points at the ends of the whiskers (the upper and the lower), e.g. with an "x" mark? The result should be a boxplot with two additional "x" marks: one at the very top of the upper whisker and one at the very bottom of the lower whisker.

I have searched the internet extensively for an answer but couldn't find one. I could only add an "x" mark at the mean of each boxplot by using stat_summary with the mean function.

How do I add the other two points?

To be on the same page, please use R's mtcars dataset and make a boxplot with mpg on the y axis and cyl on the x axis. You will end up with 3 boxplots, one for each value of cyl in the mtcars data frame.

According to R:

The upper end is defined as Q3 + 1.5*IQR
The lower end is defined as Q1 - 1.5*IQR
Note: IQR = Q3 - Q1

Upvotes: 2

Views: 1377

Answers (1)

csgillespie

Reputation: 60462

You just need to calculate the end points of the whiskers and add them using stat_summary. For example:

## Load the library
library(ggplot2)
data(mpg)

## Create a function to calculate the points
## (there is probably a built-in function that does this)
get_tails = function(x) {
  ## A single value has no whiskers to trim; without this check,
  ## abnormal marks appear at the periphery of the plot
  if (length(x) == 1) return(x)
  q1 = quantile(x)[2]
  q3 = quantile(x)[4]
  iqr = q3 - q1
  upper = q3 + 1.5 * iqr
  lower = q1 - 1.5 * iqr
  ## Trim upper and lower: keep the most extreme values inside the limits
  ## (values exactly on a limit still belong to the whisker)
  up = max(x[x <= upper])
  lo = min(x[x >= lower])
  return(c(lo, up))
}

Use stat_summary to add it to your plot (in ggplot2 versions before 3.3.0 the argument is named fun.y rather than fun):

ggplot(mpg, aes(x=drv, y=hwy)) + geom_boxplot() +
  stat_summary(geom="point", fun=get_tails, colour="Red")
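For the mtcars setup described in the question, the same idea might look like the sketch below. It assumes ggplot2 3.3.0 or later (where the argument is `fun`). The helper name `whisker_ends` is hypothetical and is defined here only so that the snippet runs on its own.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): returns the lowest and highest
# values that lie within 1.5 * IQR of the quartiles.
whisker_ends <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  inside <- x[x >= q[1] - 1.5 * iqr & x <= q[2] + 1.5 * iqr]
  range(inside)
}

# cyl is numeric in mtcars, so convert it to a factor to get one box per value
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  # shape = 4 is the "x" plotting symbol
  stat_summary(fun = whisker_ends, geom = "point", shape = 4, size = 3)

print(p)
```

Because the helper returns two values per group, stat_summary draws two points for each box, one at each whisker end.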

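Also, your definition of the end points isn't quite correct: the whiskers end at the most extreme data values that lie within Q1 - 1.5*IQR and Q3 + 1.5*IQR, not at those limits themselves. See my answer to another question for a few more details.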
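As a rough illustration of that difference, the base R sketch below compares the 1.5*IQR limits with the actual whisker ends for the four-cylinder cars in mtcars. It assumes quartiles computed with quantile() at its default settings. The variable names are made up for the example.

```r
# mpg values for the four-cylinder cars
x <- mtcars$mpg[mtcars$cyl == 4]

# Quartiles and interquartile range (quantile() default, type 7)
q <- quantile(x, c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]

# Limits as defined in the question
fences <- c(lower = q[1] - 1.5 * iqr, upper = q[2] + 1.5 * iqr)

# Whisker ends: the most extreme observations inside those limits
whiskers <- c(lower = min(x[x >= fences[["lower"]]]),
              upper = max(x[x <= fences[["upper"]]]))

print(fences)    # limits: 11.4 and 41.8
print(whiskers)  # whisker ends: 21.4 and 33.9
```

No car in this group reaches either limit, so the whiskers stop at the smallest and largest observed mpg values.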

Upvotes: 2
