Reputation: 365
I would like to create a histogram combined with a density plot in Bokeh, with a slider to filter the data. At the moment I have the building blocks for a Bokeh histogram with a density plot, taken from another thread, but I don't know how to write the callback function that updates the data and re-renders the plot.
from bokeh.io import output_file, show
from bokeh.plotting import figure
from bokeh.sampledata.autompg import autompg as df
from numpy import histogram, linspace
from scipy.stats import gaussian_kde  # public import path; scipy.stats.kde is deprecated
pdf = gaussian_kde(df.hp)
x = linspace(0,250,50)
p = figure(height=300)  # older Bokeh (< 3.0) also accepted plot_height
p.line(x, pdf(x))
# plot actual hist for comparison
hist, edges = histogram(df.hp, density=True, bins=20)
p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:], alpha=0.4)
show(p)
Upvotes: 0
Views: 1513
Reputation: 579
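There are two ways to implement callbacks in Bokeh:
1. with JavaScript code, in which case the plot stays a standalone document but all data manipulation has to happen in JavaScript (scipy can't be called from such a callback)
2. with Python code executed by a Bokeh server, in which case the callback can use any Python library
Considering you need to refit the KDE each time the filter condition changes, the second way is the only option (unless you want to do that in JavaScript...).
Here is how you would do it (example with a filter on cyl):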
from bokeh.application import Application
from bokeh.application.handlers import FunctionHandler
from bokeh.io import output_notebook, show
from bokeh.layouts import column
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource, Select
from bokeh.sampledata.autompg import autompg as df
from numpy import histogram, linspace
from scipy.stats import gaussian_kde  # public import path; scipy.stats.kde is deprecated

output_notebook()

def modify_doc(doc):
    x = linspace(0, 250, 50)

    # Start with empty sources; update() fills them in.
    source_hist = ColumnDataSource({'top': [], 'left': [], 'right': []})
    source_kde = ColumnDataSource({'x': [], 'y': []})

    p = figure(height=300)  # older Bokeh (< 3.0) also accepted plot_height
    p.line(x='x', y='y', source=source_kde)
    p.quad(top='top', bottom=0, left='left', right='right', alpha=0.4, source=source_hist)

    def update(attr, old, new):
        # Filter the data, then recompute the histogram and refit the KDE.
        if new == 'All':
            filtered_df = df
        else:
            condition = df.cyl == int(new)
            filtered_df = df[condition]

        hist, edges = histogram(filtered_df.hp, density=True, bins=20)
        pdf = gaussian_kde(filtered_df.hp)

        source_hist.data = {'top': hist, 'left': edges[:-1], 'right': edges[1:]}
        source_kde.data = {'x': x, 'y': pdf(x)}

    # Initial render with the unfiltered data.
    update(None, None, 'All')

    select = Select(title='# cyl', value='All', options=['All'] + [str(i) for i in df.cyl.unique()])
    select.on_change('value', update)

    doc.add_root(column(select, p))

# To run it in the notebook:
plot = Application(FunctionHandler(modify_doc))
show(plot)

# Or to run it stand-alone with `bokeh serve --show myapp.py`
# in which case you need to remove the `output_notebook()` call
# from bokeh.io import curdoc
# modify_doc(curdoc())
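The example above converts values to str when building the Select options (str(i)) and back to int inside update (int(new)). That round trip could be pulled into two small helpers. This is only a sketch in plain Python: the names ALL, to_options and parse_choice are made up for illustration and are not part of Bokeh, and sorting the options is an extra choice the original code does not make.

```python
ALL = 'All'  # sentinel option meaning "no filter", as in the example above

def to_options(values):
    """Build a Select option list: the sentinel first, then each distinct value as a str."""
    return [ALL] + [str(v) for v in sorted(set(values))]

def parse_choice(choice, cast=int):
    """Convert a Select value back to a filter value; None means "keep every row"."""
    return None if choice == ALL else cast(choice)

options = to_options([8, 4, 6, 3, 5, 4, 8])
print(options)            # ['All', '3', '4', '5', '6', '8']
print(parse_choice('6'))  # 6
print(parse_choice(ALL))  # None
```

Inside update, a None result would then mean "use df unchanged", and any other result would go into the comparison against df.cyl.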
A few notes:
- The example is written to run in a Jupyter notebook (see output_notebook() and the last two uncommented lines).
- Select will only handle str values, so you need to convert on the way in (when creating it) and on the way out (when using the values old and new).
- For multiple filters, you need to read the state of every Select at the same time. You do that by instantiating the Selects before defining the update function (but without any callbacks, yet!) and keeping a reference to them, accessing their value with your_ref.value and building your condition with that. After the update definition, you can then attach the callback to each Select.
Finally, an example with multiple selects:
import pandas as pd  # needed for the all-True mask below

def modify_doc(doc):
    x = linspace(0, 250, 50)

    source_hist = ColumnDataSource({'top': [], 'left': [], 'right': []})
    source_kde = ColumnDataSource({'x': [], 'y': []})

    p = figure(height=300)  # older Bokeh (< 3.0) also accepted plot_height
    p.line(x='x', y='y', source=source_kde)
    p.quad(top='top', bottom=0, left='left', right='right', alpha=0.4, source=source_hist)

    # Create the widgets first so update() can read both values.
    select_cyl = Select(title='# cyl', value='All', options=['All'] + [str(i) for i in df.cyl.unique()])
    select_ori = Select(title='origin', value='All', options=['All'] + [str(i) for i in df.origin.unique()])

    def update(attr, old, new):
        # Mask that keeps every row (renamed from `all`, which shadowed the builtin).
        all_rows = pd.Series(True, index=df.index)

        if select_cyl.value == 'All':
            cond_cyl = all_rows
        else:
            cond_cyl = df.cyl == int(select_cyl.value)

        if select_ori.value == 'All':
            cond_ori = all_rows
        else:
            cond_ori = df.origin == int(select_ori.value)

        filtered_df = df[cond_cyl & cond_ori]

        hist, edges = histogram(filtered_df.hp, density=True, bins=20)
        pdf = gaussian_kde(filtered_df.hp)

        source_hist.data = {'top': hist, 'left': edges[:-1], 'right': edges[1:]}
        source_kde.data = {'x': x, 'y': pdf(x)}

    update(None, None, 'All')

    # Attach the same callback to both widgets.
    select_ori.on_change('value', update)
    select_cyl.on_change('value', update)

    doc.add_root(column(select_ori, select_cyl, p))
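One case the update function above does not cover: once filters are combined, a selection may leave no rows, or fewer than two distinct hp values, and gaussian_kde cannot be fitted to such data. A possible way to deal with it is to move the recomputation into a helper that returns empty columns in that case. This is a sketch under that assumption, not part of the original answer; the name hist_and_kde is hypothetical.

```python
import numpy as np
from scipy.stats import gaussian_kde

def hist_and_kde(values, x, bins=20):
    """Return (hist_data, kde_data) dicts shaped like the two ColumnDataSources.

    Hypothetical helper: when `values` has fewer than two distinct entries
    the KDE cannot be fitted, so empty columns are returned instead.
    """
    values = np.asarray(values, dtype=float)
    if np.unique(values).size < 2:
        return {'top': [], 'left': [], 'right': []}, {'x': [], 'y': []}

    # `edges` has bins + 1 entries: consecutive pairs are the left/right
    # sides of each bar, hence edges[:-1] and edges[1:].
    hist, edges = np.histogram(values, density=True, bins=bins)
    pdf = gaussian_kde(values)

    hist_data = {'top': hist, 'left': edges[:-1], 'right': edges[1:]}
    kde_data = {'x': x, 'y': pdf(x)}
    return hist_data, kde_data

x = np.linspace(0, 250, 50)
hist_data, kde_data = hist_and_kde([90, 97, 100, 110, 150, 165], x)
empty_hist, empty_kde = hist_and_kde([], x)
```

In update, the two results would be assigned to source_hist.data and source_kde.data, so an empty selection simply clears the plot instead of raising an error in the callback.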
Upvotes: 1