Reputation: 6404
I have a data source that I'm trying to bin and plot as a histogram. (The data below is just an example of the post-processed bin data.)
My goal is to draw vertical lines to annotate different parts of the axis.
I got relatively close by following other StackOverflow answers, but the problem is that the axis for the vertical lines is separate from the axis for the binned data. My guess is that this is because the x values for the vertical lines are quantitative while the binned data is categorical.
Is there any way to have the vertical lines align with the x-axis on the bottom?
import altair as alt
import pandas as pd

data_bar = pd.DataFrame({
    'bin': [0.78,0.82,0.88,0.92,0.98,1.02,1.08,1.12,1.18,1.23,1.27,1.32,1.38],
    'freq': [0,3,18,95,279,416,660,411,263,200,53,22,0]
})
# Converting to category makes Altair treat the x-axis as nominal
data_bar['bin'] = data_bar['bin'].astype('category')
data_lines = pd.DataFrame({
    'value': [0.8, 0.88, 1.001, 1.38],
    'title': ['no_match', 'match', 'no_match', 'match']
})
bar = alt.Chart(data_bar).mark_bar().encode(x='bin', y='freq')
# Vertical rules positioned by a quantitative x value
vertlines = alt.Chart(data_lines).mark_rule(
    color='black',
    strokeWidth=2
).encode(x='value')
# Text labels placed next to each rule
text = alt.Chart(data_lines).mark_text(
    align='left', dx=5, dy=-5
).encode(
    x='value', text='title')
alt.layer(bar, vertlines, text).properties(width=500)
For reference, here is the graph in a vega editor.
Upvotes: 1
Views: 186
Reputation: 86330
You need to plot your binned data on a quantitative axis, which you can do by setting bin='binned' and adding an x2 encoding to specify the upper limit of each bin. Here are the required modifications to the data frame and the bar chart; the rest can stay the same:
data_bar = pd.DataFrame({
    'bin': [0.78,0.82,0.88,0.92,0.98,1.02,1.08,1.12,1.18,1.23,1.27,1.32,1.38],
    'freq': [0,3,18,95,279,416,660,411,263,200,53,22,0]
})
# Upper limit of each bin is the next bin's value; the last bin is given a width of 0.05
data_bar['bin_max'] = data_bar['bin'].shift(-1).fillna(data_bar['bin'].max() + 0.05)
# Note: don't convert data_bar['bin'] to category
bar = alt.Chart(data_bar).mark_bar().encode(
    x=alt.X('bin', bin='binned'),
    x2='bin_max',
    y='freq')
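To make the bin_max step easier to follow, here is a minimal plain-Python sketch of what the shift(-1).fillna(...) line computes. The helper name upper_edges and its last_width parameter are hypothetical and only for illustration; the 0.05 width for the last bin is the same assumption used above.

```python
# Lower edges of the bins, copied from the example data.
bins = [0.78, 0.82, 0.88, 0.92, 0.98, 1.02, 1.08,
        1.12, 1.18, 1.23, 1.27, 1.32, 1.38]


def upper_edges(lower, last_width=0.05):
    """Hypothetical helper: return the upper edge for each lower edge."""
    # Mirrors shift(-1): every bin ends where the next bin starts.
    upper = list(lower[1:])
    # Mirrors fillna(max + 0.05): the last bin has no successor,
    # so it is given an assumed width.
    upper.append(max(lower) + last_width)
    return upper


bin_max = upper_edges(bins)

# Show each bin as a (start, end) interval on the quantitative axis.
for start, end in zip(bins, bin_max):
    print(f"{start:.2f} -> {end:.2f}")
```

Each bar then spans from bin to bin_max on the same quantitative scale that the vertical rules use, which is why the two layers end up sharing one x-axis.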
Upvotes: 1