python - bokeh - stacked bar chart with conditional coloring

Question

How can I decouple the height of the stacked bars from their fill color?

I have multiple categories that I want to present in a stacked bar chart, so that the height represents the value and the color is conditionally defined by another variable (something like fill= in ggplot).

I am new to bokeh and struggling with the stacked bar chart mechanics. I tried to construct this type of chart, but so far I have gotten nothing but all sorts of errors. The bokeh documentation has very few stacked bar chart examples.

My data is stored in a pandas dataframe:

data = [
    ['A', 1, 15, 1],
    ['A', 2, 14, 2],
    ['A', 3, 60, 1],
    ['B', 1, 15, 2],
    ['B', 2, 25, 2],
    ['B', 3, 20, 1],
    ['C', 1, 15, 1],
    ['C', 2, 25, 1],
    ['C', 3, 55, 2],
    ...
]

Columns represent Category, Regime, Value, State.

I want to plot Category on the x axis, with Regimes stacked on the y axis, where bar length represents Value and color represents State.

Is this achievable in bokeh? Can anybody demonstrate, please?

syntonym · Accepted Answer

I think this problem becomes much easier if you transform your data to the following form:

from bokeh.plotting import figure
from bokeh.io import show
from bokeh.transform import stack, factor_cmap
import pandas as pd

df = pd.DataFrame({
    "Category": ["a", "b"],
    "Regime1_Value": [1, 4], 
    "Regime1_State": ["A", "B"],
    "Regime2_Value": [2, 5], 
    "Regime2_State": ["B", "B"],
    "Regime3_Value": [3, 6], 
    "Regime3_State": ["B", "A"]})

p = figure(x_range=["a", "b"])
p.vbar_stack(["Regime1_Value", "Regime2_Value", "Regime3_Value"],
        x="Category",
        fill_color=[
            factor_cmap(state, palette=["red", "green"], factors=["A", "B"]) 
            for state in ["Regime1_State","Regime2_State", "Regime3_State"]],
        line_color="black",
        width=0.9,
        source=df)

show(p)
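The wide table above does not have to be typed by hand; it can be derived from the long-format data in the question. Here is a minimal sketch with pandas, assuming the column names Category, Regime, Value and State from the question. The names long_df, values, states and wide are only illustrative. State is converted to strings because factor_cmap works on string factors, and missing Category/Regime combinations get a Value of 0.

```python
import pandas as pd

# Long-format data as described in the question (first nine rows only).
long_df = pd.DataFrame(
    [("A", 1, 15, 1), ("A", 2, 14, 2), ("A", 3, 60, 1),
     ("B", 1, 15, 2), ("B", 2, 25, 2), ("B", 3, 20, 1),
     ("C", 1, 15, 1), ("C", 2, 25, 1), ("C", 3, 55, 2)],
    columns=["Category", "Regime", "Value", "State"],
)

# One column per regime; a missing combination becomes a bar segment of height 0.
values = long_df.pivot(index="Category", columns="Regime", values="Value").fillna(0)
# States become strings so they can serve as categorical factors.
# (A missing combination would show up as the string "nan" here.)
states = long_df.pivot(index="Category", columns="Regime", values="State").astype(str)

values.columns = [f"Regime{r}_Value" for r in values.columns]
states.columns = [f"Regime{r}_State" for r in states.columns]

# One row per category, with RegimeN_Value and RegimeN_State columns.
wide = values.join(states).reset_index()
print(wide)
```

With this table, the factors passed to factor_cmap would be ["1", "2"] rather than ["A", "B"].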

This is a bit strange, because vbar_stack behaves unlike a "normal glyph". Normally you have three options for attributes of a renderer (assume we want to plot n dots/rectangles/shapes/things):

1. Give a single value that is used for all n glyphs.
2. Give a column name that is looked up in the source (source[column_name] must produce an "array" of length n).
3. Give an array of length n of data.

But vbar_stack does not create one renderer; it creates as many as there are elements in the first array you give. Let's call this number k. Then, to make sense of the attributes, you again have three options:

1. Give a single value that is used for all glyphs.
2. Give an array of k things that are used as column names in the source (each lookup must produce an array of length n).
3. Give an array of length n of data (so all k glyphs share the same data).

So p.vbar(x=[a,b,c]) and p.vbar_stack(x=[a,b,c]) actually do different things (the first gives literal data, the second gives column names), which confused me, and it's not clear from the documentation.
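The "k renderers" point can be checked directly, because vbar_stack returns a list of the renderers it creates. This is a small sketch that reuses the example dataframe from the top of this answer; the variable names value_columns and renderers are only illustrative.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

df = pd.DataFrame({
    "Category": ["a", "b"],
    "Regime1_Value": [1, 4],
    "Regime2_Value": [2, 5],
    "Regime3_Value": [3, 6],
})

value_columns = ["Regime1_Value", "Regime2_Value", "Regime3_Value"]  # k = 3 stackers

p = figure(x_range=["a", "b"])
# vbar_stack returns one renderer per stacker, not a single renderer.
renderers = p.vbar_stack(value_columns, x="Category", width=0.9,
                         source=ColumnDataSource(df))

print(len(renderers))  # one entry per column name in value_columns
```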

But why do we have to transform your data so strangely? Let's unroll vbar_stack and write it on our own (details left out for brevity):

plotted_regimes = []

for regime in regimes:
    if not plotted_regimes:
        bottom = 0
    else:
        bottom = stack(*plotted_regimes)
    p.vbar(bottom=bottom, top=stack(*plotted_regimes, regime))
    plotted_regimes.append(regime)

So for each regime we have a separate vbar whose bottom sits where the sum of the previous regimes ended. With the original data structure this is not really possible, because there doesn't need to be a value for each regime for each category. In the wide form we are forced to set such missing values to 0 explicitly.
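To make the "bottom where the previous sum ended" idea concrete, here is the same arithmetic in plain Python, without bokeh. It is only an illustration of the running sums, not bokeh's implementation; stacked_edges is a hypothetical helper, and the numbers are the example values from the top of this answer.

```python
# Wide-form values: one list per regime, one entry per category ("a", "b").
values = {
    "Regime1_Value": [1, 4],
    "Regime2_Value": [2, 5],
    "Regime3_Value": [3, 6],
}

def stacked_edges(columns):
    """Return {column name: (bottoms, tops)} using running sums per category."""
    n = len(next(iter(columns.values())))
    running = [0] * n  # height reached so far in each category
    edges = {}
    for name, column in columns.items():
        bottoms = list(running)  # this segment starts where the previous one ended
        running = [r + v for r, v in zip(running, column)]
        edges[name] = (bottoms, list(running))
    return edges

edges = stacked_edges(values)
for name, (bottoms, tops) in edges.items():
    print(name, bottoms, tops)
```

Every regime needs an entry for every category, otherwise the running sum is undefined for that category. This is why missing values have to be filled with 0.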

Because the stacked values correspond to column names, we have to put these values in one dataframe. The vbar_stack call in the beginning could also be written with stack (basically because vbar_stack is a convenience wrapper around stack).
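For reference, a fuller version of that stack-based approach could look like the sketch below. It is meant to produce the same kind of chart as the vbar_stack call at the top, assuming the same example dataframe. The variable names (value_cols, state_cols, plotted, renderers) are only illustrative.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
from bokeh.transform import factor_cmap, stack

df = pd.DataFrame({
    "Category": ["a", "b"],
    "Regime1_Value": [1, 4], "Regime1_State": ["A", "B"],
    "Regime2_Value": [2, 5], "Regime2_State": ["B", "B"],
    "Regime3_Value": [3, 6], "Regime3_State": ["B", "A"],
})
source = ColumnDataSource(df)

value_cols = ["Regime1_Value", "Regime2_Value", "Regime3_Value"]
state_cols = ["Regime1_State", "Regime2_State", "Regime3_State"]

p = figure(x_range=["a", "b"])
plotted = []    # value columns already stacked below the current one
renderers = []
for value_col, state_col in zip(value_cols, state_cols):
    # The first segment starts at 0; later ones start at the sum of the earlier columns.
    bottom = 0 if not plotted else stack(*plotted)
    renderer = p.vbar(
        x="Category",
        bottom=bottom,
        top=stack(*plotted, value_col),
        width=0.9,
        # Color each segment by the State column that belongs to this regime.
        fill_color=factor_cmap(state_col, palette=["red", "green"], factors=["A", "B"]),
        line_color="black",
        source=source,
    )
    renderers.append(renderer)
    plotted.append(value_col)

# show(p) from bokeh.io would display the chart.
```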

The factor_cmap is used so that we don't have to manually assign colors. We could also simply add a Regime1_Color column, but this way the mapping is done automatically (and client side).
