Reputation: 1277
I have the following undirected and unweighted graph and I want to measure the quality of the clustering algorithm. For this measurement I want the answer for the question:
How many unique edges are there between vertices of a single cluster?
For example: the cluster red
would have 6 edges, the cluster blue
would have 4 edges and the cluster green
would have 4 edges.
This is the code I used to generate the graph:
import networkx as nx
G = nx.Graph(directed=False).to_undirected()
G.add_edges_from([
("peter", "missy"),
("peter", "longfellow"),
("missy", "rhinehardt"),
("missy", "vivian"),
("brandon", "longfellow"),
("brandon", "zoe"),
("longfellow", "flash"),
("longfellow", "ox"),
("longfellow", "heather"),
("rhinehardt", "ox"),
("rhinehardt", "zostra"),
("rhinehardt", "vivian"),
("ox", "jenny"),
("vivian", "zostra"),
("vivian", "sarah"),
("flash", "zoe"),
("flash", "zostra"),
("flash", "heather"),
("zoe", "mathilda"),
("heather", "caitlyn"),
("heather", "sarah"),
("zostra", "mathilda"),
("zostra", "jenny"),
("sarah", "caitlyn"),
("caitlyn", "jenny")
])
Upvotes: 2
Views: 1749
Reputation: 35
The intra_community_edges
function mentioned by Sparky05 really only applies the Graph's size()
function to a subgraph. So assuming that the OP got the clusters in question from some function and that therefore the ids of the nodes will be available as some sort of iterable, you don't need to use the intra_community_edges
function (which has a slightly different use case) but can easily get the number of edges as follows:
nodes = ["caitlyn", "jenny", "zostra", "ox", "flash"]
SG = nx.Graph(self.G.subgraph(nodes))
edgenos = SG.size()
Note that the nodes given to the subgraph()
function need to be identifiers of nodes existing in your original graph G
. Only then will the subraph automatically contain all edges from G
that have a source node as well as a target node present in nodes
.
The size()
function can also be used to give you the total sum of all the edge wheights in a weighted (sub)graph:
wheighted_egdenos = SG.size(weight='weightname') # 'weightname' needs to be the name of the edge property holding the edge's weight
See the networkx documentation for Graph.size.
(Sorry for making this a whole new answer, when it's only an elaboration of Sparky05's hint - I would have added a comment to the previous answer, but I didn't have enough reputation for it.)
Upvotes: 0
Reputation: 4892
You may also want to consider the quality measures offered by networkx: Measuring Partitions. It includes coverage, modularity, and performance.
If you take a look at the code, you will also find methods for intra_community_edges
and inter_community_edges
.
Upvotes: 1
Reputation: 1277
Example for the green cluster
# Original cluster
cluster = set(["caitlyn", "jenny", "zostra", "ox", "flash"])
# Searching for external vertices between two of cluster's vertices
# Can be more efficient if the inner loop starts from the current position
# of the outer loop
for u in cluster:
for v in cluster:
for between in list(nx.shortest_path(G, u, v)):
cluster.add(between)
# Create subgraph and count edges
subgraph = G.subgraph(list(cluster))
print(len(subgraph.edges()))
Upvotes: 1