Reputation: 77
I am Modifying the Question so that it can be easily tested
Graph being used for testing
Green Nodes are Organizations and Blue Nodes are Person. Here is the script to create this graph:
CREATE (A:Organization {PRID:'A', Name:'Organization-A'})
CREATE (B:Organization {PRID:'B', Name:'Organization-B'})
CREATE (C:Organization {PRID:'C', Name:'Organization-C'})
CREATE (D:Organization {PRID:'D', Name:'Organization-D'})
CREATE (E:Organization {PRID:'E', Name:'Organization-E'})
CREATE (F:Organization {PRID:'F', Name:'Organization-F'})
CREATE (G:Organization {PRID:'G', Name:'Organization-G'})
CREATE (H:Organization {PRID:'H', Name:'Organization-G'})
CREATE (I:Organization {PRID:'I', Name:'Organization-I'})
CREATE (P1:Person {PRID:'P1', Name:'Person-P1'})
CREATE (P2:Person {PRID:'P2', Name:'Person-P2'})
CREATE (P3:Person {PRID:'P3', Name:'Person-P3'})
CREATE (P4:Person {PRID:'P4', Name:'Person-P4'})
CREATE (P5:Person {PRID:'P5', Name:'Person-P5'})
CREATE (P6:Person {PRID:'P6', Name:'Person-P6'})
CREATE
(B)-[:CONTROL]->(A),
(C)-[:CONTROL]->(A),
(D)-[:CONTROL]->(C),
(E)-[:CONTROL]->(C),
(G)-[:CONTROL]->(F),
(H)-[:CONTROL]->(F),
(D)-[:EMPLOYS]->(P1),
(P1)-[:SPOUSE]->(P2),
(P2)-[:CONSULTS]->(E),
(B)-[:EMPLOYS]->(P3),
(P3)-[:SPOUSE]->(P4),
(P4)-[:CONSULTS]->(I),
(H)-[:EMPLOYS]->(P5),
(P5)-[:SPOUSE]->(P6)
;
I am trying to write a cypher query that needs to accomplish the following:
a) Start with a Node with PRID = 'C'
b) Path p1 = All the nodes connected to starting node with relationship types CONTROL (recursively) - irrespective of direction
c) Path p2= optionally match the following relationship pattern
(x1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(x2)
where (x1) and (x2) are nodes from the path p1 - found in step (b).
Return p1 and p2
Till now have tried the following three queries (with Brian's help)
Query1:
MATCH p1=(x:Organization {PRID: 'C'})-[r:CONTROL*]-(y)
OPTIONAL MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
WHERE node1 in nodes(p1) and node2 in nodes(p1)
RETURN p1,p2;
Quesry2:
MATCH p1=(x:Organization {PRID: 'C'})-[r:CONTROL*]-(y)
WITH p1, nodes(p1) AS p1_nodes
UNWIND p1_nodes AS node1
UNWIND p1_nodes AS node2
WITH p1, node1, node2
OPTIONAL MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
WHERE p2 IS NOT NULL
RETURN p1, p2;
Query3:
MATCH p1=(x:Organization {PRID: 'C'})-[r:CONTROL*]-(y)
WITH p1, EXTRACT(node IN nodes(p1) | ID(node)) AS p1_node_ids
UNWIND p1_node_ids AS id1
UNWIND p1_node_ids AS id2
OPTIONAL MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
WHERE ID(node1) = id1 AND ID(node2) = id2 AND p2 IS NOT NULL
RETURN p1, p2;
What I'd Expect is to get back the subgraph with nodes A, B, C, D, E,P1,P2 with the relationships, however all three just give me just A, B, C, D, E with the relationships (that is just p1, nothing from p2)
Some more queries we tried that work with some anchor nodes but not with any of the node in the first hierarchy
Query-4
MATCH p1=(x:Organization {PRID: 'C'})-[r:CONTROL*]-(y)
WITH collect({path: p1, node: y}) AS paths_and_nodes
UNWIND paths_and_nodes AS paths_and_node1
UNWIND paths_and_nodes AS paths_and_node2
WITH
paths_and_node1.node AS node1,
paths_and_node2.node AS node2,
paths_and_node1.path AS path1,
paths_and_node2.path AS path2
OPTIONAL MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
RETURN path1, path2, p2, node1, node2
This one works with the x being specified as A, B or C. BUT Does not work if x points to D or E
Query 5
MATCH
p1=(org1)-[:CONTROL*]-(x:Organization {PRID: 'C'})-[:CONTROL*]-(org2)
OPTIONAL MATCH p2=(org1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(org2)
RETURN p1, p2, org1, org2
This one works with the x being specified as C,D or E. BUT Does not work if x points to A or B
One thought - so if we have the query as
MATCH p1=(x:Organization {PRID: 'E'})-[r:CONTROL*]-(y)
OPTIONAL MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
WHERE node1.PRID in ['A','B','C','D','E'] AND
node2.PRID in ['A','B','C','D','E']
RETURN p1,p2;
then it obviously works fine. So cant we somehow using COLLECT etc create this array and pass it to the next query. The problem seems to be - if after the first match i use
WITH p1, COLLECT (y.PRID) AS p1_prids
the p1_prids is not ['A','B','C','D','E'] but rather a multi row collection with just one element each
One way i could make it work consistently is
MATCH (x:Organization {PRID: 'C'})-[r:CONTROL*0..]-(y)
WITH COLLECT (y.PRID) AS p1_prids
MATCH p1=(x:Organization {PRID: 'C'})-[r:CONTROL*0..]-(y)
WITH p1,p1_prids
OPTIONAL MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
WHERE node1.PRID in p1_prids AND node2.PRID in p1_prids
but i think this is very inelegant and performance nightmare as it does the query twice - so still looking for a solution ...
What am i doing wrong in this query ?
Is there a better way to approach this problem
Thanks in advance ...
Upvotes: 1
Views: 214
Reputation: 10856
Ok, let me start over with a different answer for your new dataset (thanks for that, by the way, it was really helpful!)
The problem that I wasn't realizing was that the nodes that you want to match together will be in different results of the p1
path because they're on either side of the start node. So you could do something like this:
MATCH p1=(x:Organization {PRID: 'C'})-[r:CONTROL*]-(y)
WITH collect({path: p1, node: y}) AS paths_and_nodes
UNWIND paths_and_nodes AS paths_and_node1
UNWIND paths_and_nodes AS paths_and_node2
WITH
paths_and_node1.node AS node1,
paths_and_node2.node AS node2,
paths_and_node1.path AS path1,
paths_and_node2.path AS path2
MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
RETURN path1, path2, p2, node1, node2
Or something like this which is a lot simpler:
MATCH
p1=(org1)-[:CONTROL*]-(x:Organization {PRID: 'C'})-[:CONTROL*]-(org2),
p2=(org1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(org2)
RETURN p1, p2, org1, org2
EDIT
So I think I see the problem. I wasn't thinking about starting from any of the Organization
nodes. I'm pretty sure the second query will never work because Cypher will never hit the same node twice in a path. So looking at the first query the reason it's not working with D and E is because the default [*]
variable relationship definition is for one or more hops. If we let it be zero or more hops then it seems to work:
MATCH p1=(x:Organization {PRID: 'C'})-[r:CONTROL*0..]-(y)
WITH collect({path: p1, node: y}) AS paths_and_nodes
UNWIND paths_and_nodes AS paths_and_node1
UNWIND paths_and_nodes AS paths_and_node2
WITH
paths_and_node1.node AS node1,
paths_and_node2.node AS node2,
paths_and_node1.path AS path1,
paths_and_node2.path AS path2
OPTIONAL MATCH p2=(node1)-[:EMPLOYS]->()-[:SPOUSE]->()-[:CONSULTS]->(node2)
RETURN path1, path2, p2, node1, node2
How's that?
Upvotes: 1
Reputation: 10856
Let me give it a shot! ;)
So first off you can simplify your first MATCH
/WHERE
combo like this:
MATCH p1=(x:Organization {PID: '27762230'})-[r:`10006`|`10010`*]-(y)
So let's take that and try to do what you want:
MATCH p1=(x:Organization {PID: '27762230'})-[r:`10006`|`10010`*]-(y)
WITH p1, nodes(p1) AS p1_nodes
UNWIND p1_nodes AS node1
UNWIND p1_nodes AS node2
WITH p1, node1, node2
OPTIONAL MATCH p2=(node1)-[:`10004`]->()-[:`10051`]->()-[:`10052`]->(node2)
WHERE p2 IS NOT NULL
RETURN p1, p2
It could also be that when you call nodes(path)
that the objects that you get aren't nodes as much as Map
s of the node properties. If that's so we should be able to match via the IDs:
MATCH p1=(x:Organization {PID: '27762230'})-[r:`10006`|`10010`*]-(y)
WITH p1, EXTRACT(node IN nodes(p1) | ID(node)) AS p1_node_ids
UNWIND p1_node_ids AS id1
UNWIND p1_node_ids AS id2
OPTIONAL MATCH p2=(node1)-[:`10004`]->()-[:`10051`]->()-[:`10052`]->(node2)
WHERE ID(node1) = id1 AND ID(node2) = id2 AND p2 IS NOT NULL
RETURN p1, p2
Upvotes: 0