Olshansky

Reputation: 6404

Altair - Use the same axis for binned data and vertical annotation lines

I have a data source that I'm trying to bin and build a histogram from. (Note that the data below is just an example of the post-processed bin data.)

My goal is to draw vertical lines to annotate different parts of the axis.

I got relatively close by following other Stack Overflow answers, but the problem is that the axis for the vertical lines is separate from the axis for the binned data. My guess is that this is because the x values for the vertical lines are quantitative while the binned data is categorical.

Is there any way to have the vertical lines align with the x-axis on the bottom?

import altair as alt
import pandas as pd

data_bar = pd.DataFrame({
    'bin': [0.78,0.82,0.88,0.92,0.98,1.02,1.08,1.12,1.18,1.23,1.27,1.32,1.38],
    'freq': [0,3,18,95,279,416,660,411,263,200,53,22,0]
})
data_bar['bin'] = data_bar['bin'].astype('category')

data_lines = pd.DataFrame({
    'value': [0.8, 0.88, 1.001, 1.38],
    'title': ['no_match', 'match', 'no_match', 'match']     
})

bar = alt.Chart(data_bar).mark_bar().encode(x='bin', y='freq')
  
vertlines = alt.Chart(data_lines).mark_rule(
    color='black',
    strokeWidth=2
).encode(x='value')

text = alt.Chart(data_lines).mark_text(
    align='left', dx=5, dy=-5
).encode(
    x='value', text='title')


alt.layer(bar, vertlines, text).properties(width=500)

For reference, here is the graph in the Vega editor.

Upvotes: 1

Views: 186

Answers (1)

jakevdp

Reputation: 86330

You need to plot your binned data on a quantitative axis, which you can do by setting bin='binned' and adding an x2 encoding to specify the upper limit of each bin. Here are the required modifications to the data frame and the bar chart; the rest can stay the same:

data_bar = pd.DataFrame({
    'bin': [0.78,0.82,0.88,0.92,0.98,1.02,1.08,1.12,1.18,1.23,1.27,1.32,1.38],
    'freq': [0,3,18,95,279,416,660,411,263,200,53,22,0]
})
# Upper limit of each bin: the next bin's value; the last bin is closed 0.05 above the maximum
data_bar['bin_max'] = data_bar['bin'].shift(-1).fillna(data_bar['bin'].max() + 0.05)

# Note: don't convert data_bar['bin'] to category

bar = alt.Chart(data_bar).mark_bar().encode(
    x=alt.X('bin', bin='binned'),
    x2='bin_max',
    y='freq')

Here is the resulting chart: [image]

Upvotes: 1
