Reputation: 1338
I'm trying to create a bar chart using Plotly in Python that is both stacked and grouped.
Toy example (money spent and earned in different years):
import pandas as pd
import plotly.graph_objs as go

data = pd.DataFrame(
    dict(
        year=[2000, 2010, 2020],
        var1=[10, 20, 15],
        var2=[12, 8, 18],
        var3=[10, 17, 13],
        var4=[12, 11, 20],
    )
)

fig = go.Figure(
    data=[
        go.Bar(x=data['year'], y=data['var1'], offsetgroup=0, name='spent on fruit'),
        go.Bar(x=data['year'], y=data['var2'], offsetgroup=0, base=data['var1'], name='spent on toys'),
        go.Bar(x=data['year'], y=data['var3'], offsetgroup=1, name='earned from stocks'),
        go.Bar(x=data['year'], y=data['var4'], offsetgroup=1, base=data['var3'], name='earned from gambling'),
    ]
)
fig.show()
The result seems fine at first:
But watch what happens when I turn off e.g. "spent on fruit":
The "spent on toys" trace remains floating instead of starting from 0.
Can this be fixed? Or maybe the whole offsetgroup + base approach won't work here. But what else can I do?
Thanks!
Update: according to this GitHub issue, stacked and grouped bar plots are being developed for future Plotly versions, so this probably won't be an issue anymore.
Upvotes: 9
Views: 17430
Reputation: 2544
Plotly Express (part of recent versions of the plotly library) offers a facet_col parameter for its bar chart (and other charts as well), which lets you set an additional grouping column:
Values from this column or array_like are used to assign marks to facetted subplots in the horizontal direction.
To make it work I had to reshape the example data:
import pandas as pd

data = pd.DataFrame(
    dict(
        year=[*[2000, 2010, 2020]*4],
        var=[*[10, 20, 15], *[12, 8, 18], *[10, 17, 13], *[12, 11, 20]],
        names=[
            *["spent on fruit"]*3,
            *["spent on toys"]*3,
            *["earned from stocks"]*3,
            *["earned from gambling"]*3,
        ],
        groups=[*["subgroup1"]*6, *["subgroup2"]*6]
    )
)
| | year | var | names | groups |
---|---|---|---|---|
0 | 2000 | 10 | spent on fruit | subgroup1 |
1 | 2010 | 20 | spent on fruit | subgroup1 |
2 | 2020 | 15 | spent on fruit | subgroup1 |
3 | 2000 | 12 | spent on toys | subgroup1 |
4 | 2010 | 8 | spent on toys | subgroup1 |
5 | 2020 | 18 | spent on toys | subgroup1 |
6 | 2000 | 10 | earned from stocks | subgroup2 |
7 | 2010 | 17 | earned from stocks | subgroup2 |
8 | 2020 | 13 | earned from stocks | subgroup2 |
9 | 2000 | 12 | earned from gambling | subgroup2 |
10 | 2010 | 11 | earned from gambling | subgroup2 |
11 | 2020 | 20 | earned from gambling | subgroup2 |
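The same table can also be derived from the question's wide DataFrame rather than typed out by hand. A minimal sketch using pandas melt; the two lookup dictionaries (labels and subgroup) and the temporary column name "column" are illustrative assumptions, not part of the original answer:

```python
import pandas as pd

wide = pd.DataFrame(
    dict(
        year=[2000, 2010, 2020],
        var1=[10, 20, 15],
        var2=[12, 8, 18],
        var3=[10, 17, 13],
        var4=[12, 11, 20],
    )
)

# Hypothetical lookups: display name and subgroup for each wide column.
labels = {
    "var1": "spent on fruit",
    "var2": "spent on toys",
    "var3": "earned from stocks",
    "var4": "earned from gambling",
}
subgroup = {"var1": "subgroup1", "var2": "subgroup1", "var3": "subgroup2", "var4": "subgroup2"}

# melt stacks var1..var4 into one value column, one row per (year, column) pair.
tall = wide.melt(id_vars="year", var_name="column", value_name="var")
tall["names"] = tall["column"].map(labels)
tall["groups"] = tall["column"].map(subgroup)
tall = tall.drop(columns="column")
print(tall)
```

The result has the same columns and row order as the table above, so it can be passed to the plotting call unchanged.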
Once it's in this format (I believe this is called the "tall" format; it is also commonly called "long" or "tidy" format) you can plot it with one function call:
import plotly.express as px

fig = px.bar(data, x="groups", y="var", facet_col="year", color="names")
fig.show()
If you want to hide the subgroup labels you can update the x-axis:
fig.update_xaxes(visible=False)
Upvotes: 15
Reputation: 19565
There doesn't seem to be a way to create both stacked and grouped bar charts in Plotly, but there is a workaround that might resolve your issue. You will need to create subgroups, then use a stacked bar chart in Plotly and add the bars one at a time, plotting var1 and var2 with subgroup1, and var3 and var4 with subgroup2.
This solution gives you the functionality you want, but it changes the formatting and look of the bar chart. The bars are equally spaced because, from Plotly's point of view, these are stacked bars (not grouped bars), and I couldn't figure out a way to remove the subgroup1 and subgroup2 text without also removing the years from the x-axis ticks. Plotly experts, please feel free to chime in and improve my answer!
import pandas as pd
import plotly.graph_objs as go

df = pd.DataFrame(
    dict(
        year=[2000, 2010, 2020],
        var1=[10, 20, 15],
        var2=[12, 8, 18],
        var3=[10, 17, 13],
        var4=[12, 11, 20],
    )
)

fig = go.Figure()
fig.update_layout(
    template="simple_white",
    xaxis=dict(title_text="Year"),
    yaxis=dict(title_text="Count"),
    barmode="stack",
)

groups = ['var1', 'var2', 'var3', 'var4']
colors = ["blue", "red", "green", "purple"]
names = ['spent on fruit', 'spent on toys', 'earned from stocks', 'earned from gambling']
for i, (r, n, c) in enumerate(zip(groups, names, colors)):
    # var1 and var2 stack on the first subgroup bar; var3 and var4 on the second
    subgroup = 'subgroup1' if i <= 1 else 'subgroup2'
    fig.add_trace(
        # a two-level x (year, subgroup) makes Plotly draw a multicategory axis
        go.Bar(x=[df.year, [subgroup]*len(df.year)], y=df[r], name=n, marker_color=c),
    )
fig.show()
Upvotes: 8