Reputation: 3580
I am trying to plot a huge number of data points (2-3 million) using plotly.
When I run
py.iplot(fig, filename='test plot')
I get the following error:
Woah there! Look at all those points! Due to browser limitations, the Plotly SVG drawing functions have a hard time graphing more than 500k data points for line charts, or 40k points for other types of charts. Here are some suggestions:
(1) Use the `plotly.graph_objs.Scattergl` trace object to generate a WebGl graph.
(2) Trying using the image API to return an image instead of a graph URL
(3) Use matplotlib
(4) See if you can create your visualization with fewer data points
If the visualization you're using aggregates points (e.g., box plot, histogram, etc.) you can disregard this warning.
So then I try to save it with this:
py.image.save_as(fig, 'my_plot.png')
But then I get this error:
PlotlyRequestError: Unknown Image Server Error
How do I do this properly? I don't care if it's a still image or an interactive display within my notebook.
Upvotes: 12
Views: 23417
Reputation: 11
Use the WebGL render mode. I had a chart with ~500k points, which was very slow in the browser when rendered as SVG. After switching to WebGL, it works like a charm.
You can find some examples of how to use WebGL in plotly here:
https://plotly.com/python/webgl-vs-svg/
Upvotes: 1
Reputation: 3115
You can try the `render_mode` argument. Example:
import plotly.express as px
import pandas as pd
import numpy as np
N = int(1e6) # Number of points
df = pd.DataFrame(dict(x=np.random.randn(N),
                       y=np.random.randn(N)))
fig = px.scatter(df, x="x", y="y", render_mode='webgl')
fig.update_traces(marker_line=dict(width=1, color='DarkSlateGray'))
fig.show()
On my computer, N=1e6 takes about 5 seconds until the plot is visible, and the interactivity is still very good. With N=10e6 it takes about 1 minute and the plot is no longer responsive (i.e. zooming, panning, or anything else is really slow).
Upvotes: 0
Reputation: 698
Plotly seems to handle this poorly. I am trying to create a box plot with 5 million points, which is no problem for the simple R function "boxplot", but plotly keeps calculating endlessly.
Improving this should be a high priority. Not all of the data has to be stored (and shown) in the plotly object; I guess that is the main problem.
Upvotes: 13
Reputation: 2545
One option would be down-sampling your data, though I'm not sure that's acceptable for you: https://github.com/devoxi/lttb-py
I also have problems with plotly in the browser with large datasets - if anyone has solutions, please write! Thank you!
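To illustrate the down-sampling idea, here is a rough sketch. Note that this is not the LTTB algorithm from the linked repository; it is a simpler min/max-per-bucket decimation swapped in for illustration, which keeps the extremes of each bucket so peaks survive. The function name `minmax_downsample` and the bucket count are made up for this example, and it assumes the data is already ordered by x (e.g. a time series).

```python
import numpy as np


def minmax_downsample(x, y, n_buckets):
    """Keep the min and max y of each contiguous bucket (at most 2 * n_buckets points)."""
    x = np.asarray(x)
    y = np.asarray(y)
    # Split the index range into roughly equal contiguous buckets.
    edges = np.linspace(0, len(y), n_buckets + 1, dtype=int)
    keep = []
    for start, stop in zip(edges[:-1], edges[1:]):
        if stop <= start:
            continue  # empty bucket, happens when n_buckets > len(y)
        chunk = y[start:stop]
        keep.append(start + int(np.argmin(chunk)))
        keep.append(start + int(np.argmax(chunk)))
    idx = np.unique(keep)  # sorted indices, duplicates removed
    return x[idx], y[idx]


rng = np.random.default_rng(0)
x = np.arange(2_000_000)
y = np.cumsum(rng.standard_normal(x.size))  # synthetic random walk

xs, ys = minmax_downsample(x, y, n_buckets=5_000)
```

The reduced `xs`, `ys` (at most 10,000 points here) can then be passed to a regular plotly line trace.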
Upvotes: 8