Make Plotly scatter plots faster for large datasets - Python

Question

I have a dataset that is about 300,000 rows. Here is a snippet of the dataset.

    id                datetime            results
0   30  2020-09-29 14:55:21+00             0.0424
1   30  2020-09-29 14:55:23+00             0.0424
2   31  2020-09-29 14:55:24+00             0.0424
3   31  2020-09-29 14:55:25+00             0.0424
4   32  2020-09-29 14:55:26+00             0.0424
5   32  2020-09-29 14:55:27+00             0.0424

I tried to use matplotlib for a scatter plot, but it was really slow. I then moved on to Plotly, as I have seen that Scattergl creates interactive graphs quickly, which is exactly what I need. However, when I plot anything above 100,000 points, the graph is really slow and takes a lot of time to render.

Here is the code I implemented:

import plotly.graph_objects as go

# readings is a pandas dataframe containing the data

def plot_scatter(df, x_column, y_column):
    # Scattergl renders the markers with WebGL
    fig = go.Figure(data=go.Scattergl(x=df[x_column], y=df[y_column], mode='markers'))
    fig.show()

plot_scatter(readings, 'datetime', 'results')

I also tried to split the plotted points by id (so that each set of points with a given id has its own color and the id shows in the legend). I tried several methods, with little luck.

I would really appreciate some help on how to make a fast scatter plot (maybe there is something better than Scattergl) and how to split the data on the graph by id.
