Calculating the upper and lower limits of the boxplot statistics (i.e. the ends of the whiskers)

Question

I'm trying to verify the upper and lower limits of the boxplot statistics (i.e. the ends of the whiskers) by comparing them to the formulas Q3+(1.5*IQR) and Q1-(1.5*IQR).

Every time I run the following code, there is a small difference between the boxplot statistic and the value given by the formula.

Shouldn't these numbers be identical? Why the deviation?

# random normal distribution
df <- rnorm(500)
# convert to dataframe
df <- as.data.frame(df)
# boxplot statistics
s <- boxplot.stats(df$df)
s$stats
# Upper limit of whisker: Q3+(1.5*IQR)
s$stats[4]+(1.5*(IQR(df$df)))
# Lower limit of whisker: Q1-(1.5*IQR)
s$stats[2]-(1.5*(IQR(df$df)))

r2evans · Accepted Answer

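The whiskers extend out to the most extreme data point that is at or inside Q3+(1.5*IQR) on the upper side, and at or inside Q1-(1.5*IQR) on the lower side. In other words: go out to Q3+(1.5*IQR), and then pull back until you hit a data point. Because that limit rarely coincides with an actual observation, the whisker end usually sits slightly inside the formula value.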

We can find those values with:

set.seed(42)
vec <- rnorm(500)
st <- boxplot.stats(vec)
st$stats
# [1] -2.46133548 -0.66263842 -0.03797064  0.63573211  2.45959355


###       ,--- data
###       |   ,--- that is at or inside
###       |  |       ,--- this number
###      ,-, v  ,----^---------------------,
max(vec[ vec <= st$stats[4]+(1.5*(IQR(vec))) ])
# [1] 2.459594

min(vec[ vec >= st$stats[2]-(1.5*(IQR(vec))) ])
# [1] -2.461335
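
One further detail, offered as a sketch rather than a definitive account: as far as I can tell from the documentation, boxplot.stats() builds its box from the hinges returned by fivenum() and measures the box length as the difference between those hinges, which can differ slightly from IQR() because IQR() uses quantile(). The snippet below reconstructs the whisker ends on that assumption and shows how far each one sits from the formula value. The names hinges, h_spread, lower_fence, upper_fence, inside and whiskers are illustrative only and are not part of any R API.

```r
# Sketch: rebuild the whisker ends from the hinges (assumption: this mirrors
# what boxplot.stats() does with its default coef = 1.5).
set.seed(42)
vec <- rnorm(500)
st <- boxplot.stats(vec)

hinges   <- fivenum(vec)[c(2, 4)]  # lower and upper hinge (the box edges)
h_spread <- diff(hinges)           # box length; close to IQR(vec), not always equal
coef     <- 1.5                    # default 'coef' argument of boxplot.stats()

# Formula values from the question; these are usually not data points.
lower_fence <- hinges[1] - coef * h_spread
upper_fence <- hinges[2] + coef * h_spread

# Whisker ends: the most extreme observations at or inside those limits.
inside   <- vec >= lower_fence & vec <= upper_fence
whiskers <- range(vec[inside])

whiskers             # reconstructed whisker ends
st$stats[c(1, 5)]    # whisker ends reported by boxplot.stats()

# Distance each whisker was "pulled back" from the formula value.
c(lower = whiskers[1] - lower_fence, upper = upper_fence - whiskers[2])
```

The last line is the deviation the question asks about: it is zero only when an observation happens to fall exactly on the formula value.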
