Reputation: 1035
Here's a dummy data set I created to illustrate my problem:
import pandas as pd
import numpy as np
import altair as alt
# Use more rows than recommended so it's a bit easier to
# see the slowdown with human eyes.
alt.data_transformers.disable_max_rows()
# Create N rows with 2 columns.
N = 10000
test_df = pd.DataFrame({'t': range(0, N, 1),  # Integers counting up from 0 to N - 1.
                        'A': np.random.randint(0, 100, size=N)})  # Random integers from 0 to 99 (upper bound is exclusive).
I use it to create a line plot:
test_line = alt.Chart(test_df).mark_line().encode(x='t:Q', y='A:Q').interactive(bind_y=False)
All is good: even with more than 5000 rows, the rendered interactive line chart is snappy when I pan-and-zoom. This is obviously machine-dependent (i.e., running this code on a different machine will give different pan-and-zoom performance).
Moving forward, I tried to draw a vertical line to mark a point of interest. The code is inspired by this.
v_rule = alt.Chart(test_df).mark_rule(color='red').encode(x='a:Q').transform_calculate(a=str(5000))
alt.layer(test_line, v_rule).display()
With this, the resulting interactive line chart is slow to pan-and-zoom whenever the vertical line is on-screen.
If I move the plot around such that the vertical line is not on-screen, the pan-and-zoom becomes snappy again.
This problem becomes much worse when I try to concatenate multiple plots together, each of which is also interactive and has a vertical line.
Is there a better way to draw this vertical line? Some way to pre-render everything in advance and save to a local file? I am surprised (and confused) how a single line could be so detrimental to the rendering speed.
Upvotes: 2
Views: 502
Reputation: 48889
It is not the vertical line but transform_calculate that causes the performance penalty. The reason is that the value is calculated for each row in your dataframe. You can see this if you click the three-dots action button, choose "Open chart in Vega Editor", click Data Viewer on the right, and select data_1. I believe this is also why the line looks a bit thicker in the plot: there might be 10000 lines drawn on top of each other.
To create just a single line, you can do this instead:
v_rule = alt.Chart(pd.DataFrame({'a': [5000]})).mark_rule(color='red').encode(x='a')
alt.layer(test_line, v_rule)
Upvotes: 2