Manoel Ribeiro
Manoel Ribeiro

Reputation: 374

Back-reference neo4j

Can I use some back-reference sort of mechanism in neo4j? I'm not interested in what matches the query, just that it is the same thing in many places. Something like:

MATCH (a:Event {diagnosis1:11})
MATCH (b:Event {diagnosis1:15})
MATCH (c:Event {diagnosis1:5})
MATCH (a)-[rel:Next {PatientID:*}]->(b)
MATCH (b)-[rel1:Next {PatientID:\{1}]->(c)

The idea is that I just require the attribute IDs from both edges to be the same, without specifying it. The whole purpose of it would be not generating all possible matchings, to then filter them, but only hop in the specific places.

I've asked something similar in a more specific way here.

Edit: I know WHERE clauses can be used for that, but they filter the query AFTER matching the edges and nodes. I want to do that DURING the matching!

Upvotes: 0

Views: 56

Answers (1)

Frank Pavageau
Frank Pavageau

Reputation: 11705

Use a WHERE clause with simple references, there's no need for back references:

MATCH (a)-[rel:Next]->(b)
MATCH (b)-[rel1:Next]->(c)
WHERE rel.PatientID = rel1.PatientID

Update

First of all, Cypher is a declarative query language: you express what you want, the runtime takes care of executing and optimizing it any way it can, so it's not that obvious that it would do it the way you think it will, or that using "back references" would magically solve the problem; it's just another way of writing the same thing.

So, your problem is that the match creates all the relationship pairs before filtering them. How about splitting the match in 2 phases using WITH?

MATCH (a:Event {diagnosis1:11})-[rel:Next]->(b:Event {diagnosis1:15})
WITH a, b, rel
MATCH (b)-[rel1:Next]->(c:Event {diagnosis1:5})
WHERE rel1.PatientID = rel.PatientID

That should only select the second relationships that match the first, but I'm not sure if it's an O(n^2) algorithm in Cypher's runtime.

Otherwise, if you drop to the Java API (which would mean either an extension or a procedure, depending on your version of Neo4j), you can probably implement in O(n) by

  • scanning all the relationships between a and b, indexing them by PatientID in some multimap (see Guava, or use a Map<K, Collection<V>>); this is O(n)
  • then doing the same for all the relationships between b and c, still O(n)
  • iterate on the keys of one multimap to get the values in both and match them, still O(n)

Upvotes: 4

Related Questions