curiouscupcake
curiouscupcake

Reputation: 1277

Counting number of edges in a networkx sub-graph

I have the following undirected and unweighted graph and I want to measure the quality of the clustering algorithm. For this measurement I want the answer for the question:

How many unique edges are there between vertices of a single cluster?

clustered graph

For example: the cluster red would have 6 edges, the cluster blue would have 4 edges and the cluster green would have 4 edges.

This is the code I used to generate the graph:

import networkx as nx

G = nx.Graph(directed=False).to_undirected()

G.add_edges_from([
    ("peter", "missy"),
    ("peter", "longfellow"),
    
    ("missy", "rhinehardt"),
    ("missy", "vivian"),
    
    ("brandon", "longfellow"),
    ("brandon", "zoe"),
    
    ("longfellow", "flash"),
    ("longfellow", "ox"),
    ("longfellow", "heather"),
    
    ("rhinehardt", "ox"),
    ("rhinehardt", "zostra"),
    ("rhinehardt", "vivian"),
    
    ("ox", "jenny"),
    
    ("vivian", "zostra"),
    ("vivian", "sarah"),
    
    ("flash", "zoe"),
    ("flash", "zostra"),
    ("flash", "heather"),
    
    ("zoe", "mathilda"),
    
    ("heather", "caitlyn"),
    ("heather", "sarah"),
    
    ("zostra", "mathilda"),
    ("zostra", "jenny"),
    
    ("sarah", "caitlyn"),
    
    ("caitlyn", "jenny")
])

Upvotes: 2

Views: 1749

Answers (3)

AlinaOs
AlinaOs

Reputation: 35

The intra_community_edges function mentioned by Sparky05 really only applies the Graph's size() function to a subgraph. So assuming that the OP got the clusters in question from some function and that therefore the ids of the nodes will be available as some sort of iterable, you don't need to use the intra_community_edges function (which has a slightly different use case) but can easily get the number of edges as follows:

nodes = ["caitlyn", "jenny", "zostra", "ox", "flash"]
SG = nx.Graph(self.G.subgraph(nodes))
edgenos = SG.size()

Note that the nodes given to the subgraph() function need to be identifiers of nodes existing in your original graph G. Only then will the subraph automatically contain all edges from G that have a source node as well as a target node present in nodes.

The size() function can also be used to give you the total sum of all the edge wheights in a weighted (sub)graph:

wheighted_egdenos = SG.size(weight='weightname') # 'weightname' needs to be the name of the edge property holding the edge's weight

See the networkx documentation for Graph.size.

(Sorry for making this a whole new answer, when it's only an elaboration of Sparky05's hint - I would have added a comment to the previous answer, but I didn't have enough reputation for it.)

Upvotes: 0

Sparky05
Sparky05

Reputation: 4892

You may also want to consider the quality measures offered by networkx: Measuring Partitions. It includes coverage, modularity, and performance.

If you take a look at the code, you will also find methods for intra_community_edges and inter_community_edges.

Upvotes: 1

curiouscupcake
curiouscupcake

Reputation: 1277

Example for the green cluster

# Original cluster
cluster = set(["caitlyn", "jenny", "zostra", "ox", "flash"])

# Searching for external vertices between two of cluster's vertices 
# Can be more efficient if the inner loop starts from the current position 
# of the outer loop
for u in cluster:
    for v in cluster:
        for between in list(nx.shortest_path(G, u, v)):
            cluster.add(between)

# Create subgraph and count edges
subgraph = G.subgraph(list(cluster))
print(len(subgraph.edges()))

Upvotes: 1

Related Questions