Denver Dang

Reputation: 2615

Plotly graph object laggy when plotting many additional add_trace

I am trying to create a map where I need to draw a line between several nodes/points. There are approximately 2000 node pairs that need a line drawn between them.

I have a data frame containing the longitude/latitude coords for each node pair (i.e. col1 and col2), and then I plot it by the following:

import plotly.graph_objects as go

fig = go.Figure(go.Scattergeo())
for idx, row in df.iterrows():
    fig.add_trace(
        go.Scattergeo(
            mode="markers+lines",
            lat=[row["col1"][0], row["col2"][0]],
            lon=[row["col1"][1], row["col2"][1]],
            marker={"size": 10},
        )
    )

fig.show()

So I just run through the data frame and plot each node pair. However, my issue is that if I plot more than 400-500 pairs, the resulting plot is very slow to render, and zooming and dragging are sluggish as well.

I am not sure whether I can optimize this. I'm guessing the issue is that I call add_trace that many times, creating one trace per pair, but I can't figure out how else to draw a line between pairs only. If I just pass all latitude and longitude coords to the lat and lon args, all points get plotted and a line is drawn through every one of them in sequence, which is not intended.

So yeah, any ideas?

Upvotes: 1

Views: 1466

Answers (1)

r-beginners

Reputation: 35115

Performance will vary with the execution environment, but my answer is to change how the graph data is held. Put every pair into a single trace, with a None entry after each pair so that no line is drawn from one pair to the next. The figure then has one internal structure instead of 2000, which improves overall execution speed and memory efficiency.

import plotly.graph_objects as go
import pandas as pd
import numpy as np

N = 2000
# Random (lat, lon) tuples; the longitude bounds are passed as (low, high)
df = pd.DataFrame({'col1': [(x0,x1) for x0,x1 in zip(np.random.uniform(25.79325, 48.79275, N), np.random.uniform(-124.2460278, -70.30875, N))],
                   'col2': [(y0,y1) for y0,y1 in zip(np.random.uniform(25.79325, 48.79275, N), np.random.uniform(-124.2460278, -70.30875, N))]})

%%timeit -r 1 -n 1
fig = go.Figure()

lats = []
lons = []
for idx, row in df.iterrows():
    lat1, lat2 = row["col1"][0], row["col2"][0]
    lon1, lon2 = row["col1"][1], row["col2"][1]
    lats.append(lat1)
    lats.append(lat2)
    lons.append(lon1)
    lons.append(lon2)
    lats.append(None)
    lons.append(None)

fig.add_trace(
    go.Scattergeo(
        mode="markers+lines",
        lat=lats,
        lon=lons,
        marker={"size": 10},
        line = dict(width = 2, color = 'blue'),
        showlegend=False
    ))

fig.update_geos(fitbounds="locations", visible=True)
fig.update_layout(autosize=True, height=500, margin=dict(r=0,t=25,l=0,b=0))
fig.show()
print(len(fig.data))
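The list-building loop above can also be pulled into a small helper. This is only a sketch: `pairs_to_segments` and the sample `pairs` are hypothetical names and data, not part of Plotly or of the original code, and the helper uses plain Python so it can be checked without drawing anything.

```python
def pairs_to_segments(pairs):
    """Flatten ((lat1, lon1), (lat2, lon2)) pairs into lat/lon lists.

    A None entry follows each pair so that a line-drawing trace breaks
    there instead of joining one pair to the next.
    """
    lats, lons = [], []
    for (lat1, lon1), (lat2, lon2) in pairs:
        lats.extend([lat1, lat2, None])
        lons.extend([lon1, lon2, None])
    return lats, lons


# Hypothetical sample data: two node pairs given as (lat, lon) tuples.
pairs = [((40.0, -75.0), (34.0, -118.0)), ((47.6, -122.3), (25.8, -80.2))]
lats, lons = pairs_to_segments(pairs)
print(lats)
print(lons)
```

With the data frame from this answer, `pairs_to_segments(zip(df["col1"], df["col2"]))` should produce the same lists as the iterrows loop, and they can be passed as `lat=lats, lon=lons` to the single go.Scattergeo trace.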

Result with your code (one go.Scattergeo trace per pair, 2000 traces):

1.05 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

Result with my code (a single go.Scattergeo trace):

101 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
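The timings above come from the `%%timeit` cell magic, which is only available in IPython/Jupyter. Outside a notebook, a rough single-run equivalent might look like the sketch below. `time_once` and `build_lists` are hypothetical names used for illustration; `build_lists` merely stands in for whatever figure-building code is being measured.

```python
import time


def time_once(func, *args, **kwargs):
    """Call func once and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return elapsed, result


# Hypothetical stand-in for the figure-building code being measured.
def build_lists(n):
    return [None] * n


elapsed, result = time_once(build_lists, 6000)
print(f"{elapsed * 1000:.3f} ms for {len(result)} entries")
```

A measurement like this covers the Python side only and most likely does not capture the time the browser spends drawing, so the zoom/drag responsiveness still has to be judged interactively.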

Upvotes: 3
