mismatch between plotted networkx graph and dataframe entries

Question

I'm trying to build a directed graph from a dataframe containing my node and edge data. The graph is drawn, but when I tried to assign alpha values or a specific width to my edges, I noticed a mismatch between the data in the dataframe and what networkx draws (the edges receive the wrong width).

Here is my code:

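import matplotlib.pyplot as plot
from networkx import (MultiDiGraph, draw_networkx_edges, draw_networkx_labels,
                      draw_networkx_nodes, read_edgelist)
from networkx.drawing import nx_pydot

# df, nodes, address and taintsources are defined as shown under Edit1
df = df.drop_duplicates()
df = df.reset_index(drop=True)

# write source (f), target (t) and value (v) to a CSV edge list
edge_list = df.loc[:, ['f', 't', 'v']]
edge_list.to_csv('edges.csv', index=False, header=False)

# read the edge list back in as a directed multigraph with a float 'weight' attribute
G = read_edgelist('edges.csv', delimiter=',', create_using=MultiDiGraph(), data=[('weight', float)],
                    edgetype=float)

pos = nx_pydot.pydot_layout(G, prog='dot')

plot.figure(figsize=(10, 10), dpi=150)

draw_networkx_nodes(G, pos, node_color='skyblue', node_size=5000, nodelist=nodes)
draw_networkx_nodes(G, pos, node_size=2000, node_color='r', nodelist=[address], node_shape='s', edgecolors='black')
draw_networkx_nodes(G, pos, node_size=2000, node_color='r', nodelist=taintsources, node_shape='s', edgecolors='black')
edges = draw_networkx_edges(G, pos, arrows=True, arrowsize=30, arrowstyle='->', edge_color='black')
draw_networkx_labels(G, pos, font_size=9)

# set alphas: the i-th drawn edge gets the value from the i-th dataframe row
i = 0
for a in df['v']:
    edges[i].set_alpha(a)
    i += 1

plot.show()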

Now, the data in the dataframe (df after dropping duplicates and resetting the index) is as follows:

                f                                           t          v                                                  l
0   0xdbd838...  0x3f5ce5fbfe3e9af3971dd833d26ba9b5c936f0be   4.999775  0xaeaac2670575ca1602b598401c43e85513edf7e99974...
1   0xe6c334...  0x3f5ce5fbfe3e9af3971dd833d26ba9b5c936f0be   1.507629  0xcad1b3c29d03dc55234334d906e61dde140b91985a13...
2   0xec7bcd...  0x3f5ce5fbfe3e9af3971dd833d26ba9b5c936f0be   1.428406  0x419685acc8b968b48536d190d2c50dffc7fda8fb8579...
3   0x1fe81d...  0x3f5ce5fbfe3e9af3971dd833d26ba9b5c936f0be   2.973072  0xac8d0f5c672b5e27dad3687606bc2aedffc3611fa2f8...
4   0xe6c334...  0x3f5ce5fbfe3e9af3971dd833d26ba9b5c936f0be   0.714586  0xaa27468c07ba13b185f83e71934ab0e0aa684570faf6...
5   0xdbd838...  0x3f5ce5fbfe3e9af3971dd833d26ba9b5c936f0be   0.714511  0x56a0783f46e8176df3b5833480e6565e3110e8ba952d...
6   0xa92189...                                 0xdbd838...   5.000000  0xda791bbba0fd49e733970ad9b48d6c1fff02d5e93b1b...
7   0x523564...                                 0xa92189...   5.000255  0x0c5e6548f5285520fc03a7ebf5f636e9475c1f678db8...
8   0x5abf99...                                 0x523564...  10.714286  0x24699357078cc1cfe6b3d57a67ffab18ef132dc86996...
9   0xc50be6...                                 0xe6c334...   1.507929  0xe6bf05d6c99db12d1735a62f2c3a9df37941025de2d6...
10  0x523564...                                 0xc50be6...   1.508184  0x62c7a7793294ac46094ceb0c580fcc8593575a8c6fbc...
11  0x329fda...                                 0x1fe81d...   2.977357  0x66bd13433f5ab207aced390ba2c915f913556105f010...
12  0x9dc588...                                 0x329fda...   2.977582  0x09714e7e2dedec960801b74d4838aac99ded96bbb29e...
13  0x523564...                                 0x9dc588...   2.977837  0xe83750f61b3bde48c39ce384d3b480e393e39c78d2fb...
14  0x68a419...                                 0xec7bcd...   1.428571  0xecba439590735ca8cdd69ba5669c8fdcbda68eefeee1...
15  0xfb08f9...                                 0x68a419...   1.428826  0x6c5d9dc5af2074b1ff81ef7f700df837693d6fba21b5...
16  0x523564...                                 0xfb08f9...   1.494462  0x5d404be34d7108a3b029340f39fe2e62e6983dba858f...

There are two problems now: df contains 17 entries, whereas the graph seems to contain only 15(?) edges, so the values are not assigned to the correct edges. The resulting plot (plot.show()) shows some clearly wrong assignments for the arrow width (the widest arrow is on the wrong edge). I guess some edges are being merged in the graph, which causes the mismatch. How can I prevent this, and what is the right way to do it? Thanks a lot for your input! :)
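
One way to check the merging suspicion is to count the distinct (f, t) pairs. The following is only a minimal sketch on placeholder rows, not the real data; the names rows, pair_counts and repeated_pairs are made up for illustration. In the dataframe above, rows 0 and 5 and rows 1 and 4 appear to share the same f and t (as far as the truncated addresses show), which would leave 15 distinct pairs out of 17 rows.

```python
from collections import Counter

# Placeholder rows (f, t, v); shortened stand-ins, not the real addresses.
rows = [
    ("A", "X", 0.5),
    ("B", "X", 0.2),
    ("B", "X", 0.1),  # same (f, t) as the previous row
    ("A", "X", 0.1),  # same (f, t) as the first row
    ("C", "A", 1.0),
]

# Count how often each (source, target) pair occurs.
pair_counts = Counter((f, t) for f, t, _ in rows)

# Pairs occurring more than once are candidates for being merged or overdrawn.
repeated_pairs = {pair: n for pair, n in pair_counts.items() if n > 1}

print(len(rows), "rows,", len(pair_counts), "distinct (f, t) pairs")
print("repeated:", repeated_pairs)
```

On the dataframe itself, df.duplicated(subset=['f', 't'], keep=False) should flag the same rows.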

Edit1: Here is my data used in this code (as JSON string):

address = "0x3f5ce5fbfe3e9af3971dd833d26ba9b5c936f0be"
taintsources = ["0x5abf99..."]
nodes = ["0x9dc588...", "0xec7bcd...", "0xdbd838...", "0xc50be6...", "0xa92189...", "0x523564...", "0x1fe81d...", "0xe6c334...", "0x68a419...", "0xfb08f9...", "0x329fda..."]

df (after dropping and resetting the index):

https://pastebin.com/vc8L665V (alpha scaling)

~~https://pastebin.com/JyDLwdNJ~~ (width scaling)

Edit2: Adjusted the code for more context. I also adjusted the df values: the v column is now scaled between 0.1 and 1.0 (to match the alpha channel) instead of 1 to 10 (from my earlier attempt to set a different arrow width per edge).
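
For reference, a rescaling of this kind can be sketched as plain min-max scaling. This is an assumption about the method, since the exact scaling code is not part of the snippet above; rescale, lo and hi are illustrative names, and the input values are hypothetical.

```python
def rescale(values, lo=0.1, hi=1.0):
    """Linearly map values onto [lo, hi]; a constant input maps to hi."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Avoid division by zero when all values are equal.
        return [hi for _ in values]
    span = vmax - vmin
    return [lo + (hi - lo) * (v - vmin) / span for v in values]


weights = [1.0, 5.5, 10.0]  # hypothetical raw values, not taken from df
alphas = rescale(weights)
print(alphas)
```

The lower bound of 0.1 keeps the weakest edge faintly visible instead of fully transparent.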

Edit3: Added an image. As can be seen, the edge between 0x5abf99... and 0x523564... is not drawn as a solid line, although according to the dataframe it should be.
