Daniel

Reputation: 117

How to get all connected nodes, excluding specific relationships

I'm looking for a performant way of retrieving all connected nodes, with one twist: I would like to exclude nodes, and all of their subsequent children, that are connected via certain relationship types.

The attached figure illustrates my case.

There are two or more clusters of nodes. I would like to retrieve all nodes of a single cluster, selected by the id given in the query. Nodes that belong to other clusters and are connected only via "LINK..." relationships should not be included.

I know how to retrieve all connected nodes via:

MATCH (n:MyNode {id : 123})-[*]-(connectedNodes) RETURN connectedNodes

Filtering with a WHERE clause sounds like a bad idea, because it would still fetch the whole graph. Is there something in the APOC procedures that would let me do this? Thanks a lot in advance for your help.

EDIT 1: So far I have tried the first suggestion given in the comments, but the execution time was not acceptable. I will try to restrict the relationship and node types after all. I also tried a custom implementation in Python using a recursive function, but it is not finished yet.

EDIT 2: @InverseFalcon's suggestion worked like a charm. First filter all available relationship types to drop the ones that should not be considered, then call the apoc.path.subgraphNodes procedure with the respective starting node and the remaining valid relationship types. Thank you.

Upvotes: 2

Views: 2163

Answers (2)

Tezra

Reputation: 8833

First, I want to stress that Cypher is declarative: a query describes what should be returned, not how the data is retrieved. So try using WHERE before ruling it out (and try upgrading to the latest Neo4j to get the smartest Cypher planner). This should work just fine, because the planner can filter the results while it matches them.

MATCH (n:MyNode {id : 123})-[rs*]-(connectedNodes)
WHERE NONE(r IN rs WHERE type(r) STARTS WITH 'LINK')
RETURN DISTINCT connectedNodes

The APOC procedures I can think of require you to name the relationship types to use (you can blacklist labels, but that does not seem to apply to relationship types), so the result would be the same as -[rs:A|B|C|D*]-

Upvotes: 1

InverseFalcon

Reputation: 30417

Tezra's answer has some good points, and you'll want to return DISTINCT connectedNodes, otherwise you'll get duplicates. On a highly connected graph, however, this may take a while (or even hang) depending on the number of nodes, since Cypher is interested in all possible paths for matches, and that can quickly get out of control.

For APOC we can handle this case, but as Tezra remarked we don't have a way to blacklist relationships, and even if we had that, we don't have a way to blacklist based on partial names of the relationship types.

The approach you would need to use is to get all relationship types first, remove any that start with LINK, and then join the remaining types into a |-separated string. You can then pass that string to the relationship filter.

CALL db.relationshipTypes() YIELD relationshipType
WHERE NOT relationshipType STARTS WITH 'LINK'
WITH collect(relationshipType) as relTypes
WITH apoc.text.join(relTypes, '|') as relTypesString
MATCH (n:MyNode {id : 123})
CALL apoc.path.subgraphNodes(n, {relationshipFilter:relTypesString}) YIELD node
RETURN node as connectedNode
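
Since you mentioned trying this in Python, here is a minimal sketch of the same idea outside the database, in case it helps to see why this is cheaper than enumerating every path: each node is visited at most once, and relationships whose type starts with LINK are never followed. The function name connected_nodes, the (source, type, target) edge tuples and the sample data are assumptions made up for illustration, not part of Neo4j or APOC.

```python
from collections import deque


def connected_nodes(edges, start, excluded_prefix="LINK"):
    """Return every node reachable from `start` (including `start` itself),
    never following relationships whose type starts with `excluded_prefix`.

    `edges` is an iterable of hypothetical (source, rel_type, target) tuples;
    relationships are treated as undirected.
    """
    # Build an undirected adjacency map, skipping excluded relationship types.
    adjacency = {}
    for source, rel_type, target in edges:
        if rel_type.startswith(excluded_prefix):
            continue
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)

    # Breadth-first traversal; the visited set ensures each node is expanded
    # only once, so the work grows with nodes and edges, not with paths.
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return visited


# Made-up sample data: two clusters joined only by a LINK_TO relationship.
sample_edges = [
    (1, "HAS", 2),
    (2, "HAS", 3),
    (3, "LINK_TO", 4),
    (4, "HAS", 5),
]
cluster = connected_nodes(sample_edges, 1)
```

With the sample data, starting from node 1 stays inside the first cluster, because the only way into the second cluster is through a LINK_TO relationship.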

Upvotes: 3
