Reputation: 1488
I'm new to networkx and could do with some help please.
I have a set of data which I've processed to generate the nodes and edges. There are around 5000 groups of nodes that have more than 2 links within them (up to 10 nodes in the group in total). But the problem is that there are also several thousand pairs of nodes that only have 1 edge between them, i.e node a is linked to node b but neither are linked to any other node.
I want to remove these paired nodes from the chart.
Is there a way to filter these out?
Upvotes: 7
Views: 4619
Reputation: 23907
So our goal is to remove all nodes from components with less than 3 nodes (this includes isolated nodes if they exist).
for component in list(nx.connected_components(G)):
if len(component)<3:
for node in component:
G.remove_node(node)
A small warning is in order when using nx.connected_components
. It returns a generator of components. If I didn't put list
around it, it would generate one at a time, and then perform the steps for the given component. Once all that is done, it would generate the next component. But because G
has been modified, python can't be sure that this behaves well. So it would die (complaining that a dictionary changed size --- the number of nodes in G
changed). By turning it into a list, the components are all found before it starts the loop. So the graph won't be changing while the components are being found.
Upvotes: 12