Whiskers are defined as 1.5*IQR, so how can the two whiskers in a Python seaborn boxplot differ?

Question

According to the seaborn documentation, its boxplot function makes the whiskers 1.5*IQR long. However, the plot in that documentation does not seem to match this: the upper and lower whiskers have different lengths, and neither appears to be 1.5*IQR long.

Can someone shed some light on why they are different?

https://seaborn.pydata.org/generated/seaborn.boxplot.html

ImportanceOfBeingErnest · Accepted Answer

In principle, the assumption is correct that the whiskers of a boxplot should be of equal length if their length is a fixed multiple of the interquartile range (IQR).

However, there are essentially two cases where this is not true. Unfortunately, the English Wikipedia article does not give those reasons, so let me translate the explanation from the German Wikipedia:

Whisker
One possible definition, originating from John W. Tukey, is to restrict the length of the whisker to at most 1.5 times the interquartile range (1.5*IQR).

In this case, however, the whisker does not end exactly at this value, but rather at the most extreme data value that still lies inside this boundary. The length of the whisker is hence determined by the data and not solely by the interquartile range. This is the reason why the whiskers do not need to be the same length on both sides of the box. If there are no values outside the 1.5*IQR boundary, the length of the whiskers is determined by the minimum and maximum values. Otherwise, the values outside the whiskers are marked separately in the diagram; those values can then be treated as outliers.

A plot from the same Wikipedia page might make this more obvious:

In the case of the diagram shown in the question, the second reason almost certainly applies: the lower whisker ends at the position of the lowest data value.
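To make the rule concrete, here is a minimal sketch of how whisker ends can be computed under the Tukey definition quoted above. It uses plain NumPy rather than seaborn's own code; the helper name `whisker_ends` and the sample data are made up for illustration, and quartiles are taken with NumPy's default (linear interpolation) percentile method, which may differ slightly from other quartile conventions.

```python
import numpy as np

def whisker_ends(values, whis=1.5):
    """Hypothetical helper: quartiles and whisker ends under the Tukey rule."""
    data = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(data, [25, 75])
    iqr = q3 - q1

    # The fences mark the furthest a whisker is allowed to reach.
    low_fence = q1 - whis * iqr
    high_fence = q3 + whis * iqr

    # Each whisker stops at the most extreme data value inside its fence,
    # not at the fence itself.
    lower = data[data >= low_fence].min()
    upper = data[data <= high_fence].max()
    return q1, q3, lower, upper

# Illustrative data: nothing lies below the lower fence, one value (30)
# lies above the upper fence.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
q1, q3, lower, upper = whisker_ends(sample)

print("lower whisker length:", q1 - lower)
print("upper whisker length:", upper - q3)
print("maximum allowed length:", 1.5 * (q3 - q1))
```

With this sample the lower whisker ends at the minimum value 1 and the upper whisker ends at 9, so the two whiskers have different lengths and both are shorter than 1.5*IQR, while 30 would be drawn as a separate outlier point.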
