Akira

Reputation: 2870

Error when removing nodes that do not have a certain attribute from the graph

I've generated a graph with networkx and added the attribute 'commu_per' to some of its nodes. Now I would like to remove the nodes that do not have the attribute 'commu_per'.

import urllib3
import io
import networkx as nx
from networkx.algorithms import community

## Import dataset
http = urllib3.PoolManager()
url = 'https://raw.githubusercontent.com/leanhdung1994/WebMining/main/lesmis.gml'
f = http.request('GET', url)
data = io.BytesIO(f.data)
g = nx.read_gml(data)

## Define a function to add attributes
def add_att(g, att, att_name):
    att = list(att)
    for i in g.nodes():
        for j in range(len(att)):
            if i in list(att[j]):
                nx.set_node_attributes(g, {i: j}, name = att_name)
                break

## Add attributes
commu_per = community.k_clique_communities(g, 3)   
add_att(g, commu_per, 'commu_per')

g_1 = g
## Remove nodes which do not have attribute 'commu_per'
for i in g.nodes:
    if 'commu_per' not in g.nodes[i]:
        g_1.remove_node(i)

This raises an error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-6-7339f3f2ea6a> in <module>
     26 g_1 = g
     27 ## Remove nodes which do not have attribute 'commu_per'
---> 28 for i in g.nodes:
     29     if 'commu_per' not in g.nodes[i]:
     30         g_1.remove_node(i)

RuntimeError: dictionary changed size during iteration

Could you please elaborate on how to solve this error?

Upvotes: 0

Views: 181

Answers (3)

3khdr

Reputation: 59

delete = [i for i in g.nodes if 'commu_per' not in g.nodes[i]]
g.remove_nodes_from(delete)

Upvotes: 1

Joel

Reputation: 23887

Your problem is caused by the fact that networkx stores the graph in a data structure based on dictionaries.

When you use a for loop to step through a dictionary, Python raises a RuntimeError if the dictionary changes size during the loop. Removing a node from the graph does exactly that.
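The same failure can be reproduced without networkx. This is only a minimal sketch: the dictionary `attrs` is a made-up stand-in for the graph's internal node storage, not the actual networkx data structure.

```python
# Hypothetical stand-in for the node dictionary that networkx keeps internally.
attrs = {"a": {"commu_per": 0}, "b": {}, "c": {"commu_per": 1}}

error_message = None
try:
    for key in attrs:
        if "commu_per" not in attrs[key]:
            del attrs[key]  # changes the dict's size while it is being iterated
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)
```

The deletion itself succeeds; the exception is raised when the loop asks the dictionary for its next key.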

So you need to first build a list (or some other collection) of the nodes that don't have the attribute, and delete them only after the loop has finished.

If you aren't yet comfortable with "list comprehensions" you can do it like this:

nodes_to_delete = []
for node in g.nodes:
    if 'commu_per' not in g.nodes[node]:
        nodes_to_delete.append(node)
g.remove_nodes_from(nodes_to_delete)

With a list comprehension, which replaces the explicit for loop, you can write

nodes_to_delete = [node for node in g.nodes if 'commu_per' not in g.nodes[node]]
g.remove_nodes_from(nodes_to_delete)
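To try this without downloading the dataset, here is a self-contained sketch on a small made-up graph. The path graph and the choice of which nodes receive 'commu_per' are assumptions for illustration only; they are not taken from the question's data.

```python
import networkx as nx

# Small made-up graph 0-1-2-3; only nodes 0 and 1 receive the attribute.
g = nx.path_graph(4)
nx.set_node_attributes(g, {0: 0, 1: 0}, name="commu_per")

# Collect first, then delete, so the node dictionary is never modified mid-loop.
nodes_to_delete = [
    node for node, data in g.nodes(data=True) if "commu_per" not in data
]
g.remove_nodes_from(nodes_to_delete)

remaining = sorted(g.nodes)
print(remaining)
```

Iterating with `g.nodes(data=True)` yields each node together with its attribute dictionary, which avoids the second lookup `g.nodes[node]`.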

Upvotes: 1

lllrnr101

Reputation: 2343

Your g and g_1 are the same object: g_1 = g only binds a second name to the same graph. So iterating over one while deleting from the other still modifies the dictionary being iterated. Plain dicts behave the same way:

>>> a  = {1:10, 2:20}
>>> b  = a
>>> id(b) == id(a)
True
>>> b[4] = 40
>>> id(b) == id(a)
True
>>> b
{1: 10, 2: 20, 4: 40}
>>> a
{1: 10, 2: 20, 4: 40}
>>> 

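Use the copy() method to get an independent copy, so that you can iterate over one object while removing keys from the other. networkx graphs have a copy() method as well, so g_1 = g.copy() works the same way.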

>>> c = b.copy()
>>> id(b) == id(c)
False
>>> c[5] = 50
>>> c
{1: 10, 2: 20, 4: 40, 5: 50}
>>> 
>>> b
{1: 10, 2: 20, 4: 40}
>>> 

Another method is to iterate over a snapshot of the nodes with for i in list(g.nodes), which lets you remove nodes from g itself inside the loop.
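Both options can be sketched on a small made-up graph. The path graph and the attribute assignment below are assumptions for illustration, not the question's data.

```python
import networkx as nx

g = nx.path_graph(4)
nx.set_node_attributes(g, {0: 0, 1: 0}, name="commu_per")

alias = g        # same object as g, not a copy
g_1 = g.copy()   # separate graph with the same nodes, edges and attributes

# Option 1: iterate over g, remove from the copy. g itself is never modified.
for node in g.nodes:
    if "commu_per" not in g.nodes[node]:
        g_1.remove_node(node)
g_size_after_option_1 = g.number_of_nodes()

# Option 2: snapshot the nodes with list(), then remove from g in place.
for node in list(g.nodes):
    if "commu_per" not in g.nodes[node]:
        g.remove_node(node)

print(sorted(g_1.nodes), sorted(g.nodes))
```

Option 1 keeps the original graph intact, while option 2 avoids holding two graphs in memory.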

Upvotes: 1
