Reputation: 3
I keep getting the above error when trying to run the greedy_modularity_communities community-finding algorithm from NetworkX on a network of 123212 nodes and 329512 edges.
simpledatasetNX here is a NetworkX Graph object. Here is what I most recently ran:
greedy_modularity_communities(simpledatasetNX)
and what has been output:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-6-a3b0c8705138> in <module>()
----> 1 greedy_modularity_communities(simpledatasetNX)
2 frames
/usr/local/lib/python3.7/dist-packages/networkx/algorithms/community/modularity_max.py in greedy_modularity_communities(G, weight, resolution)
98 if j != i
99 }
--> 100 for i in range(N)
101 }
102 dq_heap = [
/usr/local/lib/python3.7/dist-packages/networkx/algorithms/community/modularity_max.py in <dictcomp>(.0)
98 if j != i
99 }
--> 100 for i in range(N)
101 }
102 dq_heap = [
/usr/local/lib/python3.7/dist-packages/networkx/algorithms/community/modularity_max.py in <dictcomp>(.0)
96 - 2 * resolution * k[i] * k[j] * q0 * q0
97 for j in [node_for_label[u] for u in G.neighbors(label_for_node[i])]
---> 98 if j != i
99 }
100 for i in range(N)
AttributeError: 'NoneType' object has no attribute 'get'
I have hit this exact error repeatedly, despite several attempts to fix it. Here's where I started, and the things I did to fix it:
desired_graph_object = nx.Graph(multidigraphobject)
and it seemed to do what I wanted it to do on a much simpler test graph I made, so I did this and then tried to run the algorithm, getting the above error.
I now suspect that there is something wrong with the way I constructed both of these graphs as the nature of the error is the same whether I run the algorithm on the simplified multidigraph object or on the graph object.
Here is my code for constructing the graph object; the code for constructing the multidigraph object was essentially copy-pasted with minor adjustments to make this, so I don't feel the need to include it. As far as I can tell, both the graph object this code constructs and the multidigraph object discussed previously look how I intended them to.
%%time
for index, row in df.iterrows():  # Go through our dataframe row by row
    if row["link_id"] != row["parent_id"]:  # Check that the current row is a response to someone else
        link_id_df = link_id_dataframe_dict[row["link_id"]]  # Get the desired thread's dataframe
        for index2, row2 in link_id_df.iterrows():  # Iterate through the thread dataframe's rows (comments)
            # Go until we find the comment whose id matches our original comment's parent_id,
            # AND check that our current potential edge isn't already an edge
            if (row2["id"] in row["parent_id"]) and not G.has_edge(row["author"], row2["author"]):
                G.add_edge(row["author"], row2["author"])  # Add the desired edge
                # This line and the next three add the necessary edge attributes.
                # Compare case-insensitively so that both "Daddit" and "daddit" match.
                if row["subreddit"].lower() == "daddit":
                    nx.set_edge_attributes(G, {(row["author"], row2["author"]): {"daddit": 1, "mommit": 0}})
                else:
                    nx.set_edge_attributes(G, {(row["author"], row2["author"]): {"daddit": 0, "mommit": 1}})
            # If the edge already exists, i.e. these two users have interacted before,
            # increase the appropriate comment quantity
            elif (row2["id"] in row["parent_id"]) and G.has_edge(row["author"], row2["author"]):
                if row["subreddit"].lower() == "daddit":
                    G[row["author"]][row2["author"]]["daddit"] += 1
                else:
                    G[row["author"]][row2["author"]]["mommit"] += 1
Some additional context: my original dataset is a massive data frame that I wanted to construct my network from. Each row represents a comment or post on a social media site. Building the network involves matching each comment's parent_id to the id of the comment it replies to. The link_id_dataframe_dict is a dictionary where each key is a thread and the associated value is a sub-dataframe of all the comments in that thread (i.e., with that link_id).
The idea is that we go through our entire data frame row by row, identify the thread/link_id that this row/comment is part of, then search through the associated link_id data frame for the row/comment that the one we are considering is a response to. When we find it, we add an edge between two nodes, where the edge represents the comment, and the two nodes are the users who posted the reply and the comment being replied to. We also record which community the reply took place in by setting that community's edge attribute to 1 and the other community's to 0, as a way of keeping track of where these users are interacting. In this version of the code, if these users have interacted before, we instead add one to the attribute for the community in which the new interaction took place.
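For reference, the procedure described above can be sketched without the nested iteration, using a lookup from comment id to author. This is only a minimal sketch in plain Python, not the code that was actually run: the function name build_edge_counts, the list-of-dicts input, and the assumption that parent_id is the parent's id with a prefix such as "t1_" in front are all hypothetical.

```python
from collections import defaultdict


def build_edge_counts(comments):
    """Count replies between pairs of authors, split by community.

    `comments` is a list of dicts with the keys id, parent_id, link_id,
    author and subreddit (a hypothetical stand-in for the data frame rows).
    """
    # One pass to map each comment id to its author.
    author_by_id = {c["id"]: c["author"] for c in comments}

    counts = defaultdict(lambda: {"daddit": 0, "mommit": 0})
    for c in comments:
        if c["parent_id"] == c["link_id"]:
            continue  # top-level comment: not a reply to another comment
        # Assumption: parent_id looks like "<prefix>_<id>"; keep the part after the prefix.
        parent_key = c["parent_id"].split("_", 1)[-1]
        parent_author = author_by_id.get(parent_key)
        if parent_author is None:
            continue  # parent comment is not in the data
        # Undirected edge: store the pair in sorted order so (a, b) == (b, a).
        pair = tuple(sorted((c["author"], parent_author)))
        community = "daddit" if c["subreddit"].lower() == "daddit" else "mommit"
        counts[pair][community] += 1
    return dict(counts)


sample = [
    {"id": "a1", "parent_id": "t3_p1", "link_id": "t3_p1", "author": "alice", "subreddit": "Daddit"},
    {"id": "a2", "parent_id": "t1_a1", "link_id": "t3_p1", "author": "bob", "subreddit": "daddit"},
    {"id": "a3", "parent_id": "t1_a2", "link_id": "t3_p1", "author": "alice", "subreddit": "Mommit"},
]
edge_counts = build_edge_counts(sample)
print(edge_counts)
```

Each (u, v) pair and its attribute dictionary could then be loaded into the graph with G.add_edge(u, v, **attrs). Like the original loop, this sketch keeps self-replies, which show up as (author, author) pairs.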
UPDATES:
I removed the self-loops from the graph, but I still run into the same error.
Upvotes: 0
Views: 1574
Reputation: 483
Updating networkx from 2.6.2 to 2.6.3 resolved this issue for me.
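To check which version is installed before upgrading, something like the following may help. It is a rough sketch: the helper names version_tuple and needs_upgrade are made up for illustration, and 2.6.3 is simply the version that worked for me, not a statement about exactly which releases are affected.

```python
import re


def version_tuple(text):
    """Turn '2.6.3' into (2, 6, 3); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def needs_upgrade(installed, known_good="2.6.3"):
    """True if `installed` is older than the version reported to work."""
    return version_tuple(installed) < version_tuple(known_good)


if __name__ == "__main__":
    try:
        import networkx as nx
    except ImportError:
        print("networkx is not installed")
    else:
        print(nx.__version__, "needs upgrade:", needs_upgrade(nx.__version__))
```

If it reports that an upgrade is needed, pip install --upgrade networkx (prefixed with ! in a notebook cell) followed by a runtime restart should pick up the newer release.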
Upvotes: 1
Reputation: 23887
It looks like you've encountered a known bug which has been corrected. More details are here:
https://github.com/networkx/networkx/pull/4996
I don't think it's yet in the most recent released version of networkx (it looks like it will appear in version 2.7), but if you replace the algorithm you're using with the code here: https://github.com/networkx/networkx/blob/main/networkx/algorithms/community/modularity_max.py it should fix this.
Upvotes: 0