Trexion Kameha

Reputation: 3580

plotly: huge number of datapoints

I am trying to plot a huge number of data points (2-3 million) using plotly.

When I run

py.iplot(fig, filename='test plot')

I get the following error:

Woah there! Look at all those points! Due to browser limitations, the Plotly SVG drawing functions have a hard time graphing more than 500k data points for line charts, or 40k points for other types of charts. Here are some suggestions:
(1) Use the `plotly.graph_objs.Scattergl` trace object to generate a WebGl graph.
(2) Trying using the image API to return an image instead of a graph URL
(3) Use matplotlib
(4) See if you can create your visualization with fewer data points

If the visualization you're using aggregates points (e.g., box plot, histogram, etc.) you can disregard this warning.

So then I try to save it with this:

py.image.save_as(fig, 'my_plot.png')

But then I get this error:

PlotlyRequestError: Unknown Image Server Error

How do I do this properly? I don't care if it's a still image or an interactive display within my notebook.

Upvotes: 12

Views: 23417

Answers (4)

Yanqing Wang

Reputation: 11

Use the WebGL render mode. I had a chart with ~500k points that was very slow in the browser when rendered as SVG. After switching to WebGL, it worked like a charm.

You can find some examples of how to use WebGL in plotly here:

https://plotly.com/python/webgl-vs-svg/

Upvotes: 1

user171780

Reputation: 3115

You can try the render_mode argument. Example:

import plotly.express as px
import pandas as pd
import numpy as np

N = int(1e6) # Number of points

df = pd.DataFrame(dict(x=np.random.randn(N),
                       y=np.random.randn(N)))

fig = px.scatter(df, x="x", y="y", render_mode='webgl')
fig.update_traces(marker_line=dict(width=1, color='DarkSlateGray'))
fig.show()

On my computer, N=1e6 takes about 5 seconds until the plot is visible, and the interactivity is still very good. With N=10e6 it takes about 1 minute and the plot is no longer responsive (i.e. zooming, panning and everything else is really slow).

Upvotes: 0

PhilippPro

Reputation: 698

Plotly really seems to be very bad at this. I am just trying to create a boxplot with 5 million points, which is no problem for the simple R function "boxplot", but plotly computes endlessly.

Improving this should be a high priority. Not all of the data has to be stored (and shown) in the plotly object; I guess this is the main problem.

Upvotes: 13

Petronella

Reputation: 2545

One option would be to down-sample your data, though I'm not sure whether that is acceptable for you: https://github.com/devoxi/lttb-py
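
To give an idea of what such down-sampling does, here is a from-scratch sketch of Largest-Triangle-Three-Buckets (LTTB) in plain Python. It is not the API of the linked package; the function name `lttb`, its signature and the sine-wave demo data are assumptions made for illustration. The input is expected to be sorted by x.

```python
import math


def lttb(points, n_out):
    """Downsample a sequence of (x, y) pairs, sorted by x, to n_out points."""
    n = len(points)
    if n_out >= n or n_out < 3:
        return list(points)  # nothing to reduce, or too few points requested

    inner = n_out - 2      # number of buckets between the two fixed endpoints
    sampled = [points[0]]  # the first point is always kept
    a = 0                  # index of the most recently selected point

    for i in range(inner):
        # Current bucket [start, end) over the interior points 1 .. n-2.
        start = i * (n - 2) // inner + 1
        end = (i + 1) * (n - 2) // inner + 1

        # Next bucket; in the final iteration it is just the last point.
        nxt_end = min((i + 2) * (n - 2) // inner + 1, n)
        nxt = points[end:nxt_end]
        avg_x = sum(p[0] for p in nxt) / len(nxt)
        avg_y = sum(p[1] for p in nxt) / len(nxt)

        # Keep the candidate forming the largest triangle with the previously
        # selected point and the average of the next bucket.
        ax, ay = points[a]
        best, best_area = start, -1.0
        for j in range(start, end):
            px, py = points[j]
            area = abs((ax - avg_x) * (py - ay) - (ax - px) * (avg_y - ay))
            if area > best_area:
                best, best_area = j, area
        sampled.append(points[best])
        a = best

    sampled.append(points[-1])  # the last point is always kept
    return sampled


# Demo: reduce 100,000 points of a sine wave to 1,000.
pts = [(i / 100, math.sin(i / 100)) for i in range(100_000)]
small = lttb(pts, 1_000)
```

The reduced list can then be plotted like any other data. For millions of points a vectorised or compiled implementation would be considerably faster than this pure-Python loop.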

I also have problems with plotly in the browser with large datasets - if anyone has solutions, please write! Thank you!

Upvotes: 8
