Ahmed Adel

Reputation: 49

Creating a boxplot using bokeh

How can I create a boxplot like this one using the Bokeh library in Python?

import seaborn as sns

df = sns.load_dataset("titanic")
sns.boxplot(x=df["age"])

Upvotes: 0

Views: 2044

Answers (1)

mosc9575

Reputation: 6337

Here is a solution using some random data as input:

import numpy as np
import pandas as pd

from bokeh.plotting import figure, output_notebook, show
output_notebook()

series = pd.Series(list(np.random.randint(0,60,100))+[101]) # one outlier added by hand

Here is the math the boxplot is based on: a few quantiles are calculated, along with the interquartile range (IQR) and the mean. Values above q3 + 1.5 * IQR or below q1 - 1.5 * IQR are treated as outliers.

qmin, q1, q2, q3, qmax = series.quantile([0, 0.25, 0.5, 0.75, 1])
iqr = q3 - q1
upper = q3 + 1.5 * iqr
lower = q1 - 1.5 * iqr
mean = series.mean()

out = series[(series > upper) | (series < lower)]

if not out.empty:
    outlier = list(out.values)

This stays the same for both solutions.
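For reference, the same arithmetic can be reproduced without pandas. This is only a minimal sketch, assuming the linear interpolation that `Series.quantile` uses by default; the helper names `quantile` and `box_stats` are made up for illustration and are not part of pandas or bokeh.

```python
def quantile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    pos = (len(sorted_vals) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * frac


def box_stats(values):
    """Return quartiles, whisker ends and outliers for a list of numbers."""
    data = sorted(values)
    q1, q2, q3 = (quantile(data, q) for q in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    upper_fence = q3 + 1.5 * iqr
    lower_fence = q1 - 1.5 * iqr
    return {
        "q1": q1,
        "q2": q2,
        "q3": q3,
        # whiskers are clipped to the data range, as in the plots below
        "upper": min(data[-1], upper_fence),
        "lower": max(data[0], lower_fence),
        "outliers": [v for v in data if v > upper_fence or v < lower_fence],
    }


stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(stats["outliers"])  # the value 100 lies above the upper fence
```

The returned whisker ends correspond to `upper` and `lower` after the `min`/`max` clipping done in the plotting code.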

vertical boxplot

k = 'age'
p = figure(
    tools="save",
    x_range= [k], # enable categorical axes
    title="Boxplot",
    width=400,   # plot_width / plot_height were removed in Bokeh 3.0
    height=500,
)

upper = min(qmax, upper)
lower = max(qmin, lower)

hbar_height = (qmax - qmin) / 500

# stems
p.segment([k], upper, [k], q3, line_color="black")
p.segment([k], lower, [k], q1, line_color="black")

# boxes
p.vbar([k], 0.7, q2, q3, line_color="black")
p.vbar([k], 0.7, q1, q2, line_color="black")

# whiskers (almost-0 height rects simpler than segments)
p.rect([k], lower, 0.2, hbar_height, line_color="black")
p.rect([k], upper, 0.2, hbar_height, line_color="black")

if not out.empty:
    p.circle([k] * len(outlier), outlier, size=6, fill_alpha=0.6)

show(p)

horizontal boxplot

To create a horizontal boxplot, hbar is used instead of vbar, and the x and y arguments are swapped in the segments and in the rects.

k = 'age'
p = figure(
    tools="save",
    y_range= [k],
    title="Boxplot",
    width=400,   # plot_width / plot_height were removed in Bokeh 3.0
    height=500,
)

upper = min(qmax, upper)
lower = max(qmin, lower)

hbar_height = (qmax - qmin) / 500

# stems
p.segment(upper, [k], q3, [k], line_color="black")
p.segment(lower, [k], q1, [k], line_color="black")

# boxes
p.hbar([k], 0.7, q2, q3, line_color="black")
p.hbar([k], 0.7, q1, q2, line_color="black")

# whiskers (almost-0 width rects simpler than segments);
# rect takes (x, y, width, height), so the thin dimension is now the width
p.rect(lower, [k], hbar_height, 0.2, line_color="black")
p.rect(upper, [k], hbar_height, 0.2, line_color="black")

if not out.empty:
    p.circle(outlier, [k] * len(outlier),  size=6, fill_alpha=0.6)

show(p)

Upvotes: 2
