Jai
Jai

Reputation: 77

Cypher Query for neo4j to get the desired traversal

I am Modifying the Question so that it can be easily tested

Graph being used for testing

enter image description here

Green Nodes are Organizations and Blue Nodes are Person. Here is the script to create this graph:

CREATE (A:Organization {PRID:'A', Name:'Organization-A'})
CREATE (B:Organization {PRID:'B', Name:'Organization-B'})
CREATE (C:Organization {PRID:'C', Name:'Organization-C'})
CREATE (D:Organization {PRID:'D', Name:'Organization-D'})
CREATE (E:Organization {PRID:'E', Name:'Organization-E'})
CREATE (F:Organization {PRID:'F', Name:'Organization-F'})
CREATE (G:Organization {PRID:'G', Name:'Organization-G'})
CREATE (H:Organization {PRID:'H', Name:'Organization-G'})
CREATE (I:Organization {PRID:'I', Name:'Organization-I'})


CREATE (P1:Person {PRID:'P1', Name:'Person-P1'})
CREATE (P2:Person {PRID:'P2', Name:'Person-P2'})
CREATE (P3:Person {PRID:'P3', Name:'Person-P3'})
CREATE (P4:Person {PRID:'P4', Name:'Person-P4'})
CREATE (P5:Person {PRID:'P5', Name:'Person-P5'})
CREATE (P6:Person {PRID:'P6', Name:'Person-P6'})

CREATE
  (B)-[:CONTROL]->(A),
  (C)-[:CONTROL]->(A),
  (D)-[:CONTROL]->(C),
  (E)-[:CONTROL]->(C),
  (G)-[:CONTROL]->(F),
  (H)-[:CONTROL]->(F),

  (D)-[:EMPLOYS]->(P1),
  (P1)-[:SPOUSE]->(P2),
  (P2)-[:CONSULTS]->(E),

  (B)-[:EMPLOYS]->(P3),
  (P3)-[:SPOUSE]->(P4),
  (P4)-[:CONSULTS]->(I),

  (H)-[:EMPLOYS]->(P5),
  (P5)-[:SPOUSE]->(P6)
;

I am trying to write a cypher query that needs to accomplish the following:

a) Start with a Node with PRID = 'C'

b) Path p1 = All the nodes connected to starting node with relationship types CONTROL (recursively) - irrespective of direction

c) Path p2= optionally match the following relationship pattern
(x1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(x2)
where (x1) and (x2) are nodes from the path p1 - found in step (b).

Return p1 and p2


Till now have tried the following three queries (with Brian's help)

Query1:
MATCH p1=(x:Organization {PRID: 'C'})-[r:CONTROL*]-(y)
OPTIONAL MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
WHERE node1 in nodes(p1) and node2 in nodes(p1)
RETURN p1,p2;

Quesry2:
MATCH p1=(x:Organization {PRID: 'C'})-[r:CONTROL*]-(y)
WITH p1, nodes(p1) AS p1_nodes
UNWIND p1_nodes AS node1
UNWIND p1_nodes AS node2
WITH p1, node1, node2
OPTIONAL MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)    
WHERE p2 IS NOT NULL
RETURN p1, p2;

Query3:
MATCH p1=(x:Organization {PRID: 'C'})-[r:CONTROL*]-(y)
WITH p1, EXTRACT(node IN nodes(p1) | ID(node)) AS p1_node_ids
UNWIND p1_node_ids AS id1
UNWIND p1_node_ids AS id2
OPTIONAL MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
WHERE ID(node1) = id1 AND ID(node2) = id2 AND p2 IS NOT NULL
RETURN p1, p2;

What I'd Expect is to get back the subgraph with nodes A, B, C, D, E,P1,P2 with the relationships, however all three just give me just A, B, C, D, E with the relationships (that is just p1, nothing from p2)

Some more queries we tried that work with some anchor nodes but not with any of the node in the first hierarchy

Query-4
MATCH p1=(x:Organization {PRID: 'C'})-[r:CONTROL*]-(y)
WITH collect({path: p1, node: y}) AS paths_and_nodes
UNWIND paths_and_nodes AS paths_and_node1
UNWIND paths_and_nodes AS paths_and_node2
WITH
  paths_and_node1.node AS node1,
  paths_and_node2.node AS node2,
  paths_and_node1.path AS path1,
  paths_and_node2.path AS path2
OPTIONAL MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
RETURN path1, path2, p2, node1, node2

This one works with the x being specified as A, B or C. BUT Does not work if x points to D or E

Query 5
MATCH
  p1=(org1)-[:CONTROL*]-(x:Organization {PRID: 'C'})-[:CONTROL*]-(org2)
OPTIONAL MATCH p2=(org1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(org2)
RETURN p1, p2, org1, org2

This one works with the x being specified as C,D or E. BUT Does not work if x points to A or B

One thought - so if we have the query as

MATCH p1=(x:Organization {PRID: 'E'})-[r:CONTROL*]-(y)
OPTIONAL MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
WHERE node1.PRID in ['A','B','C','D','E'] AND 
      node2.PRID in ['A','B','C','D','E']
RETURN p1,p2;

then it obviously works fine. So cant we somehow using COLLECT etc create this array and pass it to the next query. The problem seems to be - if after the first match i use
WITH p1, COLLECT (y.PRID) AS p1_prids
the p1_prids is not ['A','B','C','D','E'] but rather a multi row collection with just one element each

One way i could make it work consistently is

MATCH (x:Organization {PRID: 'C'})-[r:CONTROL*0..]-(y)
WITH COLLECT (y.PRID) AS p1_prids
MATCH p1=(x:Organization {PRID: 'C'})-[r:CONTROL*0..]-(y)
WITH p1,p1_prids
OPTIONAL MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
WHERE node1.PRID in p1_prids AND  node2.PRID in p1_prids

but i think this is very inelegant and performance nightmare as it does the query twice - so still looking for a solution ...

What am i doing wrong in this query ?
Is there a better way to approach this problem

Thanks in advance ...

Upvotes: 1

Views: 214

Answers (2)

Brian Underwood
Brian Underwood

Reputation: 10856

Ok, let me start over with a different answer for your new dataset (thanks for that, by the way, it was really helpful!)

The problem that I wasn't realizing was that the nodes that you want to match together will be in different results of the p1 path because they're on either side of the start node. So you could do something like this:

MATCH p1=(x:Organization {PRID: 'C'})-[r:CONTROL*]-(y)
WITH collect({path: p1, node: y}) AS paths_and_nodes
UNWIND paths_and_nodes AS paths_and_node1
UNWIND paths_and_nodes AS paths_and_node2
WITH
  paths_and_node1.node AS node1,
  paths_and_node2.node AS node2,
  paths_and_node1.path AS path1,
  paths_and_node2.path AS path2
MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
RETURN path1, path2, p2, node1, node2

Or something like this which is a lot simpler:

MATCH
  p1=(org1)-[:CONTROL*]-(x:Organization {PRID: 'C'})-[:CONTROL*]-(org2),
  p2=(org1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(org2)
RETURN p1, p2, org1, org2

EDIT So I think I see the problem. I wasn't thinking about starting from any of the Organization nodes. I'm pretty sure the second query will never work because Cypher will never hit the same node twice in a path. So looking at the first query the reason it's not working with D and E is because the default [*] variable relationship definition is for one or more hops. If we let it be zero or more hops then it seems to work:

MATCH p1=(x:Organization {PRID: 'C'})-[r:CONTROL*0..]-(y)
WITH collect({path: p1, node: y}) AS paths_and_nodes
UNWIND paths_and_nodes AS paths_and_node1
UNWIND paths_and_nodes AS paths_and_node2
WITH
  paths_and_node1.node AS node1,
  paths_and_node2.node AS node2,
  paths_and_node1.path AS path1,
  paths_and_node2.path AS path2
OPTIONAL MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
RETURN path1, path2, p2, node1, node2

How's that?

Upvotes: 1

Brian Underwood
Brian Underwood

Reputation: 10856

Let me give it a shot! ;)

So first off you can simplify your first MATCH/WHERE combo like this:

MATCH p1=(x:Organization {PID: '27762230'})-[r:`10006`|`10010`*]-(y) 

So let's take that and try to do what you want:

MATCH p1=(x:Organization {PID: '27762230'})-[r:`10006`|`10010`*]-(y) 
WITH p1, nodes(p1) AS p1_nodes
UNWIND p1_nodes AS node1
UNWIND p1_nodes AS node2
WITH p1, node1, node2
OPTIONAL MATCH p2=(node1)-[:`10004`]->()-[:`10051`]->()-[:`10052`]->(node2)
WHERE p2 IS NOT NULL
RETURN p1, p2

It could also be that when you call nodes(path) that the objects that you get aren't nodes as much as Maps of the node properties. If that's so we should be able to match via the IDs:

MATCH p1=(x:Organization {PID: '27762230'})-[r:`10006`|`10010`*]-(y) 
WITH p1, EXTRACT(node IN nodes(p1) | ID(node)) AS p1_node_ids
UNWIND p1_node_ids AS id1
UNWIND p1_node_ids AS id2
OPTIONAL MATCH p2=(node1)-[:`10004`]->()-[:`10051`]->()-[:`10052`]->(node2)
WHERE ID(node1) = id1 AND ID(node2) = id2 AND p2 IS NOT NULL
RETURN p1, p2

Upvotes: 0

Related Questions