Stanyko

Reputation: 195

Dataframe Boxplot in Python displays incorrect whiskers

In this simple example the boxplot shows the wrong minimum and maximum whiskers.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([1, 2, 3, 4, 5]),
                  columns=['a'])
df.boxplot()

Outcome:

[Image: boxplot of column 'a']

Following the regular formulas (Q3 + 1.5 * IQR and Q1 - 1.5 * IQR), the whiskers should be at 7 and -1, but as seen in the picture they are at 5 and 1. It looks like the formula uses 0.5 instead of 1.5. How can I change it back to the standard?

Q1 = df['a'].quantile(0.25)
Q2 = df['a'].quantile(0.50)
Q3 = df['a'].quantile(0.75)

print(Q1, Q2, Q3)
IQR = Q3 - Q1
MaxO = (Q3 + 1.5 * IQR)
MinO = (Q1 - 1.5 * IQR)
print("IQR:", IQR, "Max:", MaxO, "Min:", MinO)

Outcome:

2.0 3.0 4.0

IQR: 2.0 Max: 7.0 Min: -1.0

(Q1, Q2, Q3 and IQR match the plot, but Min and Max do not)

Upvotes: 0

Views: 792

Answers (1)

cmosig

Reputation: 1317

Source

From above the upper quartile, a distance of 1.5 times the IQR is measured out and a whisker is drawn up to the largest observed point from the dataset that falls within this distance. Similarly, a distance of 1.5 times the IQR is measured out below the lower quartile and a whisker is drawn down to the lowest observed point from the dataset that falls within this distance. All other observed points are plotted as outliers.
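
In other words, 7 and -1 are only the limits (fences); the whiskers stop at the most extreme data points inside them, which here are 5 and 1. A minimal sketch of that rule, assuming NumPy's default linear-interpolation percentiles (the helper name whisker_ends is made up for illustration and is not part of pandas or matplotlib):

```python
import numpy as np

def whisker_ends(values, whis=1.5):
    """Return (low whisker, high whisker, (lower fence, upper fence))."""
    data = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(data, [25, 75])
    iqr = q3 - q1
    # Fences: the furthest a whisker is allowed to reach.
    lower_fence = float(q1 - whis * iqr)
    upper_fence = float(q3 + whis * iqr)
    # Whiskers end at the most extreme observed points inside the fences.
    inside = data[(data >= lower_fence) & (data <= upper_fence)]
    return float(inside.min()), float(inside.max()), (lower_fence, upper_fence)

low, high, fences = whisker_ends([1, 2, 3, 4, 5])
print(fences)     # (-1.0, 7.0)
print(low, high)  # 1.0 5.0
```

So the plot is already using the standard 1.5 factor. If you want a different factor, matplotlib's boxplot has a whis parameter, and DataFrame.boxplot forwards extra keyword arguments to it, so something like df.boxplot(whis=3.0) should work.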

Upvotes: 1
