Yuval

Reputation: 317

NEO4J - Matching a path where middle node might exist or not

I have the following graph:

I would like to get all contractors, subcontractors and clients, starting from David. So I thought of a query like this:

MATCH (a:contractor)-[*0..1]->(b)-[w:works_for]->(c:client) return a,b,c

This would return:

(0:contractor {name:"David"})   (0:contractor {name:"David"})   (56:client {name:"Sarah"})
(0:contractor {name:"David"})   (1:subcontractor {name:"John"}) (56:client {name:"Sarah"})

This is the desired result. The issue is performance. If the DB contains millions of records and I leave (b) without a label, the query will take forever. If I add a label to (b), such as (b:subcontractor), I won't hit millions of rows, but I will only get the rows where (b) is a subcontractor:

(0:contractor {name:"David"}) (1:subcontractor {name:"John"}) (56:client {name:"Sarah"})

Is there a more efficient way to do this?

link to graph example: https://console.neo4j.org/r/pry01l

Upvotes: 0

Views: 66

Answers (1)

Luanne

Reputation: 19373

There are some things to consider with your query. The relationship type on the variable-length part is not specified. Is it the case that the only relationships from contractor nodes are works_for and hired? If not, you should constrain the relationship types being matched in your query. For example:

MATCH (a:contractor)-[:works_for|hired*0..1]->(b)-[w:works_for]->(c:client)
RETURN a,b,c

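The fact that (b) is unlabelled does not mean that every node in the graph will be matched. (b) is only ever reached by traversal from a labelled node: it is either (a) itself (the zero-length case), or a node one hop away from a :contractor over works_for or hired if you specify those types (over any outgoing relationship if you do not). In both cases it must also have an outgoing works_for relationship to a :client.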
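
One way to check this on your own data is to prefix the query with PROFILE and read the plan. This is only a sketch using the labels and relationship types from the question; the operators and row counts you see will depend on your Neo4j version and your data.

```cypher
// PROFILE executes the query and reports each plan operator with its rows and db hits.
PROFILE
MATCH (a:contractor)-[:works_for|hired*0..1]->(b)-[w:works_for]->(c:client)
RETURN a, b, c
```

If the plan starts with a NodeByLabelScan on one of the labelled nodes followed by expand steps, rather than an AllNodesScan, then the unlabelled (b) is not causing a scan of the whole graph.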

If you do want to label it, and you have a hierarchy of types, you can assign multiple labels to nodes and just use the most general one in your query. For example, you could have a label such as ExternalStaff as the generic label, and then further add Contractor or SubContractor to distinguish individual nodes. Then you can do something like

MATCH (a:contractor)-[:works_for|hired*0..1]->(b:ExternalStaff)-[w:works_for]->(c:client)
RETURN a,b,c
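
For that query to return anything, the generic label has to exist on the nodes first. A minimal sketch of adding it, assuming the lowercase contractor and subcontractor labels from the question; ExternalStaff is just the example name used above, not something already in your graph:

```cypher
// Add the generic label to every node that already carries one of the specific labels.
// ExternalStaff is an example label name (assumption); pick whatever fits your model.
MATCH (n)
WHERE n:contractor OR n:subcontractor
SET n:ExternalStaff
```

On a graph with millions of nodes you would probably want to apply this in batches rather than in a single transaction.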

Which of these fits best really depends on your use cases.

Upvotes: 1
