Reputation: 25
I'm building a Bokeh dashboard with country data in which a line graph changes dynamically.
Users are able to select multiple countries using a CheckboxGroup.
I am able to subset the source table dynamically as I select/deselect countries.
After I subset, I aggregate the source table for the graph (grouping all countries by date), and this is where the problem occurs.
I understand that we have to pass source=src directly,
but I need to aggregate again every time I update the source.
Are there any suggestions on how I can approach this issue?
Thanks!
def make_plot(src):
    temp = pd.DataFrame.from_dict(src.data)
    agg_date_full = ColumnDataSource(temp.groupby('date').sum().reset_index())
    fig1.line('date', 'y', source=agg_date_full)

def update(attr, old, new):
    country_to_plot = [country_checkbox.labels[i] for i in country_checkbox.active]
    new_src = make_dataset(country_to_plot)
    src.data.update(new_src.data)

country_checkbox = CheckboxGroup(labels=country_labels, active=list(range(0, len(country_labels))))
country_checkbox.on_change('active', update)
initial_countries = [country_checkbox.labels[i] for i in country_checkbox.active]
src = make_dataset(initial_countries)
p = make_plot(src)
Upvotes: 1
Views: 548
Reputation: 1446
The answer depends on how you plan to deploy and use your dashboard.
If you can run a bokeh server, then it's fairly straightforward to achieve the dynamic transformation of the data you describe.
We can get an example timeseries dataset with multiple countries from the World Bank using their API. From your description it should be close enough:
http://api.worldbank.org/v2/country/eas;ecs;lcn;mea;nac;sas;ssf/indicator/EN.ATM.CO2E.KT?source=2&downloadformat=csv
After a little bit of tidying up, the dataframe should look like this:
Country Name                  Year   Value
East Asia & Pacific           1960   1.215380e+06
Europe & Central Asia         1960   4.583646e+06
Latin America & Caribbean     1960   3.024539e+05
Middle East & North Africa    1960   9.873685e+04
North America                 1960   3.083749e+06
...
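The "tidying up" step is not spelled out here, so the following is only a minimal sketch of one way to do it with pandas. It assumes the downloaded table is in wide form (one row per region, one column per year); the tiny `wide` frame is a hypothetical stand-in for the real CSV, its 1961 values are placeholders, and the helper name `tidy` is made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded CSV (assumed wide layout:
# one row per region, one column per year). The 1961 values are placeholders.
wide = pd.DataFrame(
    {
        "Country Name": ["North America", "East Asia & Pacific"],
        "Country Code": ["NAC", "EAS"],
        "1960": [3.083749e06, 1.215380e06],
        "1961": [3.0e06, None],
    }
)


def tidy(wide_df):
    """Reshape a year-per-column table into Country Name / Year / Value rows."""
    # Treat every column whose name is all digits as a year column.
    year_cols = [c for c in wide_df.columns if c.isdigit()]
    long_df = wide_df.melt(
        id_vars=["Country Name"],
        value_vars=year_cols,
        var_name="Year",
        value_name="Value",
    )
    long_df["Year"] = long_df["Year"].astype(int)
    # Drop missing observations and sort so that years appear in order.
    return (
        long_df.dropna(subset=["Value"])
        .sort_values(["Year", "Country Name"])
        .reset_index(drop=True)
    )


df = tidy(wide)
print(df)
```

Sorting by year here matters later on, because the plotting code pairs the x values with the per-year sums.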
Now the bokeh code. I've used the single module approach from the docs, but you can make it as complex as you'd like. Note that you should put this code in a .py file, not run it from a Jupyter notebook.
from bokeh.layouts import row
from bokeh.models import CheckboxGroup, NumeralTickFormatter
from bokeh.plotting import figure, curdoc

# df is the tidied DataFrame shown above (columns: Country Name, Year, Value)
initial = (
    df[df["Country Name"] == "Europe & Central Asia"]
    .groupby("Year")["Value"]
    .sum()
)
# take x from the grouped index so that x and y always line up
initial_x = initial.index.values
initial_y = initial.values

# create a plot and style its properties
p = figure(height=400, width=600, toolbar_location=None)
p.yaxis[0].formatter = NumeralTickFormatter(format="0.0a")
p.yaxis.axis_label = "CO2 emissions (kt)"
p.xaxis.axis_label = "Years"

# create line renderer
line = p.line(x=initial_x, y=initial_y, line_width=2)
ds = line.data_source

# create a callback that will reset the datasource
def callback(attr, old, new):
    # new holds the indices of the currently ticked checkboxes
    selected = [checkbox_group.labels[i] for i in new]
    filtered = df[df["Country Name"].isin(selected)]
    totals = filtered.groupby("Year")["Value"].sum()
    # replace the whole data dict in one assignment so x and y stay the same length
    ds.data = {"x": totals.index.values, "y": totals.values}

# add checkboxes and the callback
labels = list(df["Country Name"].unique())
checkbox_group = CheckboxGroup(labels=labels, active=[1])
# on_change calls the callback with (attr, old, new)
checkbox_group.on_change("active", callback)

# put the checkboxes and plot in a layout and add to the document
curdoc().add_root(row(checkbox_group, p))
Now, when you run bokeh serve --show app.py from your terminal, you will be able to see your dashboard in the browser.
When you click on different regions, their carbon emissions will be added up and plotted as one line.
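The "added up" step can be checked without running a server. Below is a minimal sketch of that aggregation as a plain pandas function; the name `aggregate_selected` and the sample numbers are made up for illustration, and the column names are assumed to match the dataframe shown earlier.

```python
import pandas as pd


def aggregate_selected(df, selected):
    """Sum Value per Year over the selected regions.

    Returns a dict with equal-length "x" and "y" arrays, which is the shape
    a ColumnDataSource expects when its .data is replaced.
    """
    filtered = df[df["Country Name"].isin(selected)]
    totals = filtered.groupby("Year")["Value"].sum()
    # Take x from the grouped index so it always lines up with y.
    return {"x": totals.index.to_numpy(), "y": totals.to_numpy()}


# Made-up sample data, for illustration only.
sample = pd.DataFrame(
    {
        "Country Name": ["A", "B", "A", "B"],
        "Year": [1960, 1960, 1961, 1961],
        "Value": [1.0, 2.0, 3.0, 4.0],
    }
)

both = aggregate_selected(sample, ["A", "B"])
only_a = aggregate_selected(sample, ["A"])
nothing = aggregate_selected(sample, [])

print(both)    # y holds the per-year sums over A and B
print(only_a)  # y holds A's values only
```

Inside the checkbox callback, the returned dict could be assigned to the line's data source in a single step, so the re-aggregation happens on every selection change. With nothing selected, the function returns empty arrays and the line simply disappears.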
Upvotes: 1