eschares

Reputation: 109

Altair: Facet on a numerical variable with custom groupings

I have a dataset of academic journals and various related measures. One measure is a Tier (0-15) that represents, imperfectly, how important a journal is to our campus.

I want to make a series of scatter plots of usage (y) vs. cost (x), faceted by Tier. I can get a faceted chart to work, but it produces 16 separate charts, one per Tier. I found a way to specify the sort order, but that seems to take only strings (as in the GOOG, MSFT example).

What I want is to facet on groups of Tiers, which are numeric values: one scatter plot with only the journals in Tiers 1-4, another with only Tiers 5-8, then Tiers 9-12, and then Tiers 13-15. I can't find a way to specify a grouping of continuous values.

CPU_2020_with_1figrTier = alt.Chart(df[filt]).mark_circle(size=75, opacity=0.5).encode(
    alt.X('Total Cost:Q', axis=alt.Axis(format='$,.2r'), scale=alt.Scale(clamp=True)),
    alt.Y('CPU_2020:Q', title='Cost per Use 2020'),  # scale=alt.Scale(type='log')
    color=alt.Color('1figr Tier:N'),  # scale=subscribed_colorscale; nominal data type
    tooltip=['Title Name', 'Format', '1figr Tier', 'Total Cost', 'CPU_2020', 'Decision'],
).interactive().properties(
    height=150,
    title={
        "text": ["CPU_2020 vs. Cost, color-coded by 1figr Tier (where available)"],
        "subtitle": ["Hold"],
        "color": "black",
        "subtitleColor": "gray",
    },
).facet(
    row=alt.Row('1figr Tier:N')
)

Current output screenshot at https://ibb.co/YPV84nH

Upvotes: 1

Views: 451

Answers (1)

jakevdp

Reputation: 86330

Any time you want to bin a value, you can use the same bin transform that you use to make histograms. The difference here is that the binning is applied in the column encoding rather than in the x encoding.

Here's a quick example demonstrating this, which you can hopefully adapt to your data:

import altair as alt
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'x': np.random.rand(500),
    'y': np.random.randn(500),
    'tier': np.random.randint(0, 16, 500),  # upper bound is exclusive, so this yields tiers 0-15
})

alt.Chart(df).mark_point().encode(
    x='x:Q',
    y='y:Q',
    column=alt.Column('tier:Q', bin=alt.Bin(minstep=5))
).properties(width=300)


Upvotes: 1
