olive

Reputation: 189

Zoom in on a Plotly boxplot in Python by showing only the whiskers and hiding the outliers

I have the following boxplot in Plotly for my Streamlit application:

import plotly.express as px
import streamlit as st

fig = px.box(df, x=x_column, y=y_column, color=x_column)
st.plotly_chart(fig, use_container_width=True)

I have not found an elegant way to zoom in on the boxplot so that it only shows the boxes up to the whiskers and hides the outliers. The outliers have extreme values that completely ruin the presentation. Without the outliers, the boxplot would be readable again.

The boxplot: (image)

The desired boxplot: (image)

Does anyone know how I can achieve this? Thank you!

Upvotes: 1

Views: 1263

Answers (2)

migueldhr

Reputation: 36

You can use Box from plotly.graph_objects and set the opacity of the markers to 0. The downside is that you can only show the box itself, not the individual points beside it. Here is the code:

import plotly.graph_objects as go

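fig = go.Figure(go.Box(
    y=y_column,  # the data values (e.g. df[y_column] in the question's setup)
    marker=dict(opacity=0),  # opacity 0 makes the outlier markers invisible
))

fig.update_yaxes(range=[y_min, y_max])  # set the y range; y_min and y_max are values you choose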
fig.show()

With opacity = 1 (default): (image: with outliers)

With opacity = 0: (image: no outliers)

With the y range adjusted: (image: no outliers, zoomed in)

You can determine y_min and y_max by following Hamzah's answer. The risk is that the computed values might not suit your axis scale. With the data I used, it would look odd to limit the y-axis to, for example, 253 when the ticks are 100 apart. So I recommend choosing the limits by inspection and setting the values that fit best (in my case, 300).
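If you would rather not pick the limit by eye, one option is to round a computed limit outward to the tick step. This is only a minimal sketch: the helper names round_up_to_step and round_down_to_step are made up for illustration, and the step of 100 is taken from my example above, not from anything Plotly reports.

```python
import math


def round_up_to_step(value, step):
    """Round value up to the next multiple of step (hypothetical helper)."""
    return math.ceil(value / step) * step


def round_down_to_step(value, step):
    """Round value down to the previous multiple of step (hypothetical helper)."""
    return math.floor(value / step) * step


# A computed upper limit of 253 with ticks 100 apart becomes 300.
y_max = round_up_to_step(253, 100)
# A computed lower limit of -20 with the same step becomes -100.
y_min = round_down_to_step(-20, 100)
```

The rounded values can then go into fig.update_yaxes(range=[y_min, y_max]) as in the code above.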

Upvotes: 0

Hamzah Al-Qadasi

Reputation: 9786

There is no pre-zooming option in Plotly. The only solution is to calculate Q1 and Q3 and set the range of the y-axis as follows:

import plotly.express as px
from scipy import stats

df = px.data.tips()
fig = px.box(df, y="total_bill")

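arr = fig['data'][0]['y']  # y values of the first (and only) trace

# Quartiles and interquartile range
Q1 = stats.scoreatpercentile(arr, 25)
Q3 = stats.scoreatpercentile(arr, 75)
IQR = Q3 - Q1

# Fences: points beyond these are drawn as outliers
Upper_fence = Q3 + (1.5 * IQR)
Lower_fence = Q1 - (1.5 * IQR)

# Restrict the y-axis to the fences
fig.update_layout(
    yaxis=dict(
        range=[Lower_fence, Upper_fence]
    )
)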

fig.show()

Before zooming:

(image)

After adding the zooming option:

(image)

You can return to the original plot by clicking the Autoscale option in the modebar.
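The figure in the question has one box per category (x=x_column, color=x_column), so fences computed from the first trace alone may not fit every box. Below is a minimal sketch of one way to handle that, using only the standard library: find the whisker ends for each group and take the overall range. The function name whisker_range, the dict-of-lists input, and the 5% padding are assumptions for illustration. Plotly may compute quartiles slightly differently, so treat the result as approximate.

```python
from statistics import quantiles


def whisker_range(groups, k=1.5, pad=0.05):
    """Return (low, high) covering every group's whiskers, ignoring outliers.

    groups: mapping of group label -> list of numbers (assumed input shape).
    k: fence multiplier (1.5 is the usual boxplot convention).
    pad: fraction of the range added as a margin on each side (assumption).
    """
    lows, highs = [], []
    for values in groups.values():
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        lower_fence = q1 - k * iqr
        upper_fence = q3 + k * iqr
        # Whiskers end at the most extreme points still inside the fences.
        inside = [v for v in values if lower_fence <= v <= upper_fence]
        lows.append(min(inside))
        highs.append(max(inside))
    low, high = min(lows), max(highs)
    margin = pad * (high - low)
    return low - margin, high + margin


# Made-up sample data: each group has one extreme outlier.
groups = {
    "a": [1, 2, 3, 4, 5, 100],
    "b": [10, 11, 12, 13, 14, -50],
}
y_min, y_max = whisker_range(groups)
```

The resulting pair can be passed to fig.update_yaxes(range=[y_min, y_max]), and Autoscale still restores the full view.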

Upvotes: 3
