Johannes
Johannes

Reputation: 111

How do I calculate the Graph Edit Distance with networkx(Python)?

I am working with the graph edit distance; According to the definition it is the minimum sum of costs to transform the original graph G1 into a graph that is isomorphic to G2;

The graph edit operations typically include:

Now I want to use the implementation networkx has - I do not have any edge labels, the node set of G1 and G2 is the same and I do not want a graph isomorphic to G2 but I want G2 itself;

This is mainly because G1: 1->2->3 and G2: 3->2->1 are isomorphic to each other but if the nodes represent some events, from a perspective of causality, they are very very different;

So in this context, I've been running a test like the following:

import networkx as nx

G=nx.DiGraph()
G.add_node(1)
G.add_node(2)
G.add_node(3)
G.add_edges_from([(1, 2),(2,3)])


G2=nx.DiGraph()
G2.add_node(1)
G2.add_node(2)
G2.add_node(3)
G2.add_edges_from([(3, 2),(2, 1)])


nx.graph_edit_distance(G,G2)

But it returns that the distance is zero which makes sense because the graphs are isomorphic to each other;

So I tried to set up node_match but still no luck

import networkx as nx

def nmatch(n1, n2):
    return n1==n2 

G=nx.DiGraph()
G.add_node(1)
G.add_node(2)
G.add_node(3)
G.add_edges_from([(1, 2),(2,3)])


G2=nx.DiGraph()
G2.add_node(1)
G2.add_node(2)
G2.add_node(3)
G2.add_edges_from([(3, 2),(2, 1)])


nx.graph_edit_distance(G,G2, node_match=nmatch)

If we assume a cost of 1 to delete or add an edge/ vertex, then the edit distance should be 4, because we can:

How would it be suitable to calculate the edit distance not considering isomorphy but really equivalence?

Upvotes: 6

Views: 3185

Answers (2)

Olha Kholod
Olha Kholod

Reputation: 559

This is another solution, which produce 2

import networkx as nx

G=nx.DiGraph()
G.add_node(1, id=1)
G.add_node(2, id=2)
G.add_node(3, id=3)
G.add_edges_from([(1, 2),(2,3)])


G2=nx.DiGraph()
G2.add_node(1, id=1)
G2.add_node(2, id=2)
G2.add_node(3, id=3)
G2.add_edges_from([(3, 2),(2,1)])

# arguments
# arguments for nodes
def node_subst_cost(node1, node2):
    # check if the nodes are equal, if yes then apply no cost, else apply 1
    if node1 == node2:
        return 0
    return 1

def node_del_cost(node):
    return 1  # here you apply the cost for node deletion

def node_ins_cost(node):
    return 1  # here you apply the cost for node insertion

# arguments for edges
def edge_subst_cost(edge1, edge2):
    # check if the edges are equal, if yes then apply no cost, else apply 3
    if edge1==edge2:
        return 0
    return 1

def edge_del_cost(node):
    return 1  # here you apply the cost for edge deletion

def edge_ins_cost(node):
    return 1  # here you apply the cost for edge insertion

paths, cost = nx.optimal_edit_paths(
    G,
    G2,
    node_subst_cost=node_subst_cost,
    node_del_cost=node_del_cost,
    node_ins_cost=node_ins_cost,
    edge_subst_cost=edge_subst_cost,
    edge_del_cost=edge_del_cost,
    edge_ins_cost=edge_ins_cost
)

print(cost)

If you run it on Python 2.7, add the following lines to the header

# This Python file uses the following encoding: utf-8
from __future__ import print_function, unicode_literals
from __future__ import absolute_import, division 

Upvotes: 0

It doesn't seem that you are comparing what you want to. n1 and n2 in nmatch are always {}. From documentation

(...) That is, the function will receive the node attribute dictionaries for n1 and n2 as inputs.

you are not comparing the nodes object, but dictionaries associated with them (as any data you need)

You can add your custom data to that dictionary when adding nodes, for example:

import networkx as nx

def nmatch(n1, n2):
    return n1==n2

G=nx.DiGraph()
G.add_node(1, id=1)
G.add_node(2, id=2)
G.add_node(3, id=3)
G.add_edges_from([(1, 2),(2,3)])


G2=nx.DiGraph()
G2.add_node(1, id=1)
G2.add_node(2, id=2)
G2.add_node(3, id=3)
G2.add_edges_from([(3, 2),(2,1)])


nx.graph_edit_distance(G,G2, node_match=nmatch)

returns 2, as you can do 2 edge substitutions. You could probably increase substitution cost if you wanted result to be 4 (2 insertions, 2 deletions)

Upvotes: 3

Related Questions