Kurt Sopa

Reputation: 23

Calculating the upper and lower limits of the boxplot statistics (i.e. the ends of the whiskers)

I'm trying to verify the upper and lower limits of the boxplot statistics (i.e. the ends of the whiskers) by comparing them to the formulas Q3+(1.5*IQR) and Q1-(1.5*IQR).

Each time I run the following code, there is a small difference between the boxplot statistic and the value from the formula.

Shouldn't these numbers be identical? Why the deviation?

# random normal distribution
df <- rnorm(500)
# convert to dataframe
df <- as.data.frame(df)
# boxplot statistics
s <- boxplot.stats(df$df)
s$stats
# Upper limit of whisker: Q3+(1.5*IQR)
s$stats[4]+(1.5*(IQR(df$df)))
# Lower limit of whisker: Q1-(1.5*IQR)
s$stats[2]-(1.5*(IQR(df$df)))

Upvotes: 2

Views: 1008

Answers (1)

r2evans

Reputation: 160447

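The whiskers extend out to the most extreme data points that are at or inside Q3+(1.5*IQR) and Q1-(1.5*IQR). Meaning, for the upper whisker, go out to Q3+(1.5*IQR), and then pull it back until it hits data; the lower whisker works the same way from Q1-(1.5*IQR).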

We can find those values with:

set.seed(42)
vec <- rnorm(500)
st <- boxplot.stats(vec)
st$stats
# [1] -2.46133548 -0.66263842 -0.03797064  0.63573211  2.45959355


###       ,--- data
###       |  ,--- that is at or inside
###       |  |       ,--- this number
###      ,-, vv ,----^---------------------,
max(vec[ vec <= st$stats[4]+(1.5*(IQR(vec))) ])
# [1] 2.459594

min(vec[ vec >= st$stats[2]-(1.5*(IQR(vec))) ])
# [1] -2.461335
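
One more detail that can matter for small or awkward samples: as far as I can tell from the source of `boxplot.stats`, the box edges come from `fivenum()` (the "hinges"), and the box length is the distance between those hinges, which is not always exactly what `IQR()` returns. Here is a minimal sketch that rebuilds the whisker ends that way and compares them with `st$stats`. The names `hinges`, `box_len`, `fences`, `inside`, and `whiskers` are my own, chosen for illustration.

```r
set.seed(42)
vec <- rnorm(500)
st <- boxplot.stats(vec)

# Lower and upper hinge (elements 2 and 4 of the five-number summary)
hinges <- fivenum(vec)[c(2, 4)]

# Box length measured between the hinges (assumption: this is what
# boxplot.stats uses internally, rather than IQR())
box_len <- diff(hinges)

# Fences: the formula values, which are usually NOT data points
fences <- c(hinges[1] - 1.5 * box_len, hinges[2] + 1.5 * box_len)

# Whisker ends: the most extreme data points at or inside the fences
inside <- vec >= fences[1] & vec <= fences[2]
whiskers <- range(vec[inside])

# Compare with the first and last boxplot statistics
all.equal(whiskers, st$stats[c(1, 5)])
```

If the assumption about the hinges holds, the last line should report that the two agree. The fences themselves will still differ from the whisker ends, because a fence is a computed cutoff and a whisker end is an observed value.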

Upvotes: 2
