mindrunner

Reputation: 176

Plot a Trellis Stacked Bar Chart in Altair by combining column values

I'd like to plot a trellis stacked bar chart like the Trellis Stacked Bar Chart example in the Altair gallery.

I have this dataset:

pd.DataFrame({
    'storage': ['dev01', 'dev01', 'dev01', 'dev02', 'dev02', 'dev03'],
    'project': ['omega', 'alpha', 'beta', 'omega', 'beta', 'alpha'],
    'read': [3, 0, 0, 114, 27, 82],
    'write': [70, 0, 0, 45, 655, 203],
    'read-write': [313, 322, 45, 89, 90, 12]
})

  storage project  read  write  read-write
0   dev01   omega     3     70         313
1   dev01   alpha     0      0         322
2   dev01    beta     0      0          45
3   dev02   omega   114     45          89
4   dev02    beta    27    655          90
5   dev03   alpha    82    203          12

What I can't figure out is how to tell Altair to use the read, write, and read-write columns as the stacked values, with one color per column.

Upvotes: 1

Views: 540

Answers (2)

jakevdp

Reputation: 86433

Your data is wide-form, and must be converted to long-form to be used in Altair encodings. See Long-Form vs. Wide-Form Data in Altair's documentation for more information.

This can be addressed by modifying the input data in Pandas using pd.melt, but it is often more convenient to use Altair's Fold Transform to do this reshaping within the chart specification. For example:

import pandas as pd
import altair as alt

df = pd.DataFrame({
    'storage': ['dev01', 'dev01', 'dev01', 'dev02', 'dev02', 'dev03'],
    'project': ['omega', 'alpha', 'beta', 'omega', 'beta', 'alpha'],
    'read': [3, 0, 0, 114, 27, 82],
    'write': [70, 0, 0, 45, 655, 203],
    'read-write': [313, 322, 45, 89, 90, 12]
})

alt.Chart(df).transform_fold(
    ['read', 'write', 'read-write'],
    as_=['mode', 'value']
).mark_bar().encode(
    x='value:Q',
    y='project:N',
    column='storage:N',
    color='mode:N'
).properties(
    width=200
)


Upvotes: 3

EliadL

Reputation: 7088

You need to melt your desired columns into a new column:

# assuming your DataFrame is assigned to `df`

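cols_to_melt = ['read', 'write', 'read-write']
# every other column is kept as an identifier (difference() returns them sorted)
cols_to_keep = df.columns.difference(cols_to_melt)

# column names go into 'mode'; their values go into the default 'value' column
df = df.melt(id_vars=list(cols_to_keep), value_vars=cols_to_melt, var_name='mode')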

So you get the following:

   project storage        mode  value
0    omega   dev01        read      3
1    alpha   dev01        read      0
2     beta   dev01        read      0
3    omega   dev02        read    114
4     beta   dev02        read     27
5    alpha   dev03        read     82
6    omega   dev01       write     70
7    alpha   dev01       write      0
8     beta   dev01       write      0
9    omega   dev02       write     45
10    beta   dev02       write    655
11   alpha   dev03       write    203
12   omega   dev01  read-write    313
13   alpha   dev01  read-write    322
14    beta   dev01  read-write     45
15   omega   dev02  read-write     89
16    beta   dev02  read-write     90
17   alpha   dev03  read-write     12

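Then, in the Altair snippet from the gallery example, replace color='site' with color='mode', and map the remaining encodings to your own columns (for example, x='value').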

Upvotes: 2
