soungalo

Reputation: 1338

stacked + grouped bar chart

I'm trying to create a bar chart with plotly in Python that is both stacked and grouped.
Toy example (money spent and earned in different years):

import pandas as pd
import plotly.graph_objs as go

data = pd.DataFrame(
    dict(
        year=[2000,2010,2020],
        var1=[10,20,15],
        var2=[12,8,18],
        var3=[10,17,13],
        var4=[12,11,20],
    )
)

fig = go.Figure(
    data = [
        go.Bar(x=data['year'], y=data['var1'], offsetgroup=0, name='spent on fruit'),
        go.Bar(x=data['year'], y=data['var2'], offsetgroup=0, base=data['var1'], name='spent on toys'),
        go.Bar(x=data['year'], y=data['var3'], offsetgroup=1, name='earned from stocks'),
        go.Bar(x=data['year'], y=data['var4'], offsetgroup=1, base=data['var3'], name='earned from gambling'),
    ]
)
fig.show()   

The result seems fine at first. But watch what happens when I turn off a trace, e.g. "spent on fruit": the "spent on toys" trace remains floating instead of starting from 0.
Can this be fixed? Or maybe the whole offsetgroup + base approach won't work here, in which case what else can I do?
Thanks!

Update: according to this GitHub issue, stacked and grouped bar plots are being developed for a future plotly version, so this probably won't be an issue for much longer.

Upvotes: 9

Views: 17430

Answers (2)

Saaru Lindestøkke

Reputation: 2544

Plotly Express (part of recent versions of the plotly library) offers a facet_col parameter for its bar chart (and other charts as well), which lets you set an additional grouping column:

Values from this column or array_like are used to assign marks to facetted subplots in the horizontal direction.

To make it work I had to reshape the example data:

import pandas as pd

data = pd.DataFrame(
    dict(
        year=[*[2000, 2010, 2020]*4],
        var=[*[10, 20, 15], *[12, 8, 18], *[10, 17, 13], *[12, 11, 20]],
        names=[
            *["spent on fruit"]*3,
            *["spent on toys"]*3,
            *["earned from stocks"]*3,
            *["earned from gambling"]*3,
        ],
        groups=[*["subgroup1"]*6, *["subgroup2"]*6]
    )
)
    year  var  names                 groups
0   2000  10   spent on fruit        subgroup1
1   2010  20   spent on fruit        subgroup1
2   2020  15   spent on fruit        subgroup1
3   2000  12   spent on toys         subgroup1
4   2010  8    spent on toys         subgroup1
5   2020  18   spent on toys         subgroup1
6   2000  10   earned from stocks    subgroup2
7   2010  17   earned from stocks    subgroup2
8   2020  13   earned from stocks    subgroup2
9   2000  12   earned from gambling  subgroup2
10  2010  11   earned from gambling  subgroup2
11  2020  20   earned from gambling  subgroup2
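
If your data starts out in the wide layout from the question, you probably don't need to type the tall table by hand; pandas' melt should get you there. A minimal sketch, assuming the same toy data; the names wide, tall, labels and subgroups are my own and not from the original posts:

```python
import pandas as pd

# Wide layout, as in the question
wide = pd.DataFrame(
    dict(
        year=[2000, 2010, 2020],
        var1=[10, 20, 15],
        var2=[12, 8, 18],
        var3=[10, 17, 13],
        var4=[12, 11, 20],
    )
)

# Hypothetical lookup tables: legend label and subgroup for each wide column
labels = {
    "var1": "spent on fruit",
    "var2": "spent on toys",
    "var3": "earned from stocks",
    "var4": "earned from gambling",
}
subgroups = {
    "var1": "subgroup1",
    "var2": "subgroup1",
    "var3": "subgroup2",
    "var4": "subgroup2",
}

# One row per (year, original column) pair
tall = wide.melt(id_vars="year", var_name="column", value_name="var")
tall["names"] = tall["column"].map(labels)
tall["groups"] = tall["column"].map(subgroups)
tall = tall.drop(columns="column")
```

The resulting tall frame has the same four columns (year, var, names, groups) as the hand-written one, so it can be passed to the plotting call in the same way.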

Once it's in this format (I believe this is called the "tall" or "long" format) you can plot it with one function call:

import plotly.express as px

fig = px.bar(data, x="groups", y="var", facet_col="year", color="names")
fig.show()

[Figure: Plotly Express bar chart, grouped and stacked]

If you want to hide the subgroup labels you can update the x-axis:

fig.update_xaxes(visible=False)

[Figure: Plotly Express bar chart, grouped and stacked, without x-axis labels]

Upvotes: 15

Derek O

Reputation: 19565

There doesn't seem to be a way to create a bar chart that is both stacked and grouped in Plotly, but there is a workaround that might resolve your issue. You will need to create subgroups, then use a stacked bar chart and add the traces one at a time, placing var1 and var2 in subgroup1, and var3 and var4 in subgroup2.

This solution gives you the functionality you want, but changes the formatting and aesthetic of the bar chart. There will be equal spacing between all bars because, from Plotly's point of view, these are stacked bars (not grouped bars), and I couldn't figure out a way to eliminate the subgroup1 and subgroup2 text without also getting rid of the years in the x-axis ticks. Any Plotly experts, please feel free to chime in and improve my answer!

import pandas as pd
import plotly.graph_objs as go

df = pd.DataFrame(
    dict(
        year=[2000,2010,2020],
        var1=[10,20,15],
        var2=[12,8,18],
        var3=[10,17,13],
        var4=[12,11,20],
    )
)
        
fig = go.Figure()

fig.update_layout(
    template="simple_white",
    xaxis=dict(title_text="Year"),
    yaxis=dict(title_text="Count"),
    barmode="stack",
)

groups = ['var1','var2','var3','var4']
colors = ["blue","red","green","purple"]
names = ['spent on fruit','spent on toys','earned from stocks','earned from gambling']

for i, (r, n, c) in enumerate(zip(groups, names, colors)):
    # var1 and var2 are stacked on the first subgroup's bar,
    # var3 and var4 on the second subgroup's bar
    subgroup = 'subgroup1' if i <= 1 else 'subgroup2'
    fig.add_trace(
        go.Bar(x=[df.year, [subgroup]*len(df.year)], y=df[r], name=n, marker_color=c),
    )

fig.show()   


Upvotes: 8
