biowhat

Reputation: 17

Shared triples between two knowledge graphs

I want to compare two semantic knowledge graphs using Cypher, to see whether they have any triples in common.

MATCH (n1)-[r1]-(c1)
MATCH (n2)-[r2]-(c2)  
WHERE r1.filePath = "../data/graph1.json" 
AND r2.filePath = "../data/graph2.json"
AND n1 = n2
AND r1 = r2
AND c1 = c2
RETURN n1, n2, r1, r2, c1, c2

Both graphs are loaded into Neo4j, and the only way to tell them apart is the relationship property "filePath".

Is this the correct way to do it? Are there other algorithms for finding similarities between graphs?

Upvotes: 0

Views: 142

Answers (2)

cybersam

Reputation: 67019

If you don't care whether r1 and r2 have the same direction or the same type, this simple query should be sufficient to find all "common triples" between the two knowledge graphs (that is, relationships from different files that connect the same pair of nodes):

MATCH (n)-[r1]-(c)-[r2]-(n)
WHERE r1.filePath <> r2.filePath
RETURN n, c, r1, r2

Or, if you want them to have the same direction:

MATCH (n)-[r1]-(c)-[r2]-(n)
WHERE r1.filePath <> r2.filePath AND ENDNODE(r1) = ENDNODE(r2)
RETURN n, c, r1, r2

Or, if you want the same direction and type (without naming a specific type in the MATCH pattern):

MATCH (n)-[r1]-(c)-[r2]-(n)
WHERE TYPE(r1) = TYPE(r2) AND r1.filePath <> r2.filePath AND ENDNODE(r1) = ENDNODE(r2)
RETURN n, c, r1, r2

Also, I'd suggest a shorter relationship property value, since long values waste storage space. Something like "g1" and "g2" may be sufficient for your needs.

Upvotes: 1

Vincent Rupp

Reputation: 655

You say "two semantic Knowledge Graphs", which leads me to think you have two Neo4j databases, but your code implies you have one graph database with (possibly) two different sets of data. I'll assume it's the latter.

Your code, as is, won't work the way you intend. By saying r1 = r2, you're requiring that these be the exact same relationship object. A single relationship has only one filePath value, so the two filePath conditions can never both hold and the query returns nothing. You might want type(r1) = type(r2), but that would be better specified as:

MATCH (n1)-[r1:MY_REL_TYPE]-(c1)
MATCH (n2)-[r2:MY_REL_TYPE]-(c2)

Further, saying n1 = n2 and c1 = c2 implies you are looking for pairs of nodes that have two MY_REL_TYPE relationships between them with different filePath values. If that's what you want, it would be simpler as:

MATCH (n1)-[r1:MY_REL_TYPE]-(c1)
MATCH (n1)-[r2:MY_REL_TYPE]-(c1)
WHERE r1.filePath = "../data/graph1.json" 
AND r2.filePath = "../data/graph2.json"
RETURN n1, r1, r2, c1

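However, if n1 and n2 are intended to be separate nodes, you probably want the labels to match. You also need some predicate that says when two nodes count as the same entity; otherwise the query returns every graph1 relationship paired with every graph2 relationship. The last two conditions below assume a hypothetical identifying property called name:

MATCH (n1:MyNodeLabel)-[r1:MY_REL_TYPE]-(c1:MyOtherNodeLabel)
MATCH (n2:MyNodeLabel)-[r2:MY_REL_TYPE]-(c2:MyOtherNodeLabel)
WHERE r1.filePath = "../data/graph1.json"
AND r2.filePath = "../data/graph2.json"
AND n1.name = n2.name  // "name" is an assumed property; use whatever identifies your nodes
AND c1.name = c2.name
RETURN n1, n2, r1, r2, c1, c2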

Finally, yes, there are algorithms for measuring graph similarity, but I'm not sure that's your goal here, since you want specific triples to match rather than an overall picture.
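To make the difference concrete, here is a minimal Python sketch, not tied to Neo4j, that treats each graph as a set of (subject, relationship type, object) tuples. It shows both the exact shared triples and one simple overall similarity measure, the Jaccard index. The function names and the sample triples are made up for illustration, and it assumes you have already exported each graph's triples in some comparable form (for example by node name).

```python
def shared_triples(triples_a, triples_b):
    """Return the triples that appear in both graphs (exact matches only)."""
    return set(triples_a) & set(triples_b)


def jaccard_similarity(triples_a, triples_b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(triples_a), set(triples_b)
    union = a | b
    if not union:
        # Convention chosen here: two empty graphs are treated as identical.
        return 1.0
    return len(a & b) / len(union)


# Made-up sample data standing in for the two exported graphs.
graph1 = [
    ("aspirin", "TREATS", "headache"),
    ("aspirin", "INTERACTS_WITH", "warfarin"),
]
graph2 = [
    ("aspirin", "TREATS", "headache"),
    ("ibuprofen", "TREATS", "fever"),
]

common = shared_triples(graph1, graph2)
score = jaccard_similarity(graph1, graph2)

print(common)  # the one triple both graphs contain
print(score)   # 1 shared triple out of 3 distinct triples
```

The first function answers the question as asked (which triples match), while the second gives the kind of "overall picture" number a similarity algorithm would produce.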

Hope that helps. :)

Upvotes: 0
