WildBill

Reputation: 9291

How to find specific subgraph in Neo4j using where clause

I have a large graph in which some relationships have a property that I want to use to prune the graph as I build a subgraph. For example, suppose some relationships have a property called relevance_score. I want to start at one node and expand outward, collecting all nodes and relationships, but stop expanding wherever a relationship has that property.

My attempt to do so netted this query:

start n=node(15) match (n)-[r*]->(x) WHERE NOT HAS(r.relevance_score) return x, r

My attempt has two issues I cannot resolve:

1) On reflection, I believe this will not produce a pruned graph but rather a collection of disjoint graphs.

2) I am getting the following error from what looks like a correctly formed Cypher query:

Type mismatch: expected Any, Map, Node or Relationship but was Collection<Relationship> (line 1, column 52 (offset: 51))
"start n=node(15) match (n)-[r*]->(x) WHERE NOT HAS(r.relevance_score) return x, r"

Upvotes: 2

Views: 606

Answers (2)

InverseFalcon

Reputation: 30397

You should be able to use the ALL() function on the collection of relationships to enforce that for all relationships in the path, the property in question is null.

Using Gabor's sample graph, this query should work.

MATCH p = (n {name: 'n1'})-[rs1*]->()
WHERE ALL(rel in rs1 WHERE rel.relevance_score is null)
RETURN p
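
A variant closer to the question's original query, as a sketch only: it assumes the start node is looked up by the internal id 15 taken from the question (via id(n)), and it returns the end node and the relationship list instead of the path. Because ALL() tests every relationship on the path from the start node, any node reachable only through a relationship carrying relevance_score is excluded, so the result should stay connected to the start node rather than splitting into disjoint pieces. It also avoids the type error from the question, which arises because r in [r*] is bound to a collection of relationships rather than a single relationship.

```cypher
// Sketch: id 15 comes from the question; id(n) is assumed to be available in your Neo4j version.
MATCH (n)-[rs*]->(x)
WHERE id(n) = 15
  // keep a path only if none of its relationships has a relevance_score
  AND ALL(rel IN rs WHERE rel.relevance_score IS NULL)
RETURN x, rs
```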

Upvotes: 3

Gabor Szarnyas

Reputation: 5047

One solution that I can think of is to go through all relationships (with rs*), keep only the ones without the relevance_score property, and check whether the resulting list is still the same as the rs "path". (I quoted "path" because technically it is not a Neo4j path, just a list of relationships.)

I created a small example graph:

CREATE
  (n1:Node {name: 'n1'}),
  (n2:Node {name: 'n2'}),
  (n3:Node {name: 'n3'}),
  (n4:Node {name: 'n4'}),
  (n5:Node {name: 'n5'}),
  (n1)-[:REL {relevance_score: 0.5}]->(n2)-[:REL]->(n3),
  (n1)-[:REL]->(n4)-[:REL]->(n5)

The graph contains a single edge with a relevance_score property, between nodes n1 and n2.


The query (note that I used {name: 'n1'} to get the start node, you might use START node=...):

MATCH (n {name: 'n1'})-[rs1*]->(x)
UNWIND rs1 AS r
WITH n, rs1, x, r
WHERE r.relevance_score IS NULL
WITH n, rs1, x, collect(r) AS rs2
WHERE rs1 = rs2
RETURN n, x

The results:

╒══════════╤══════════╕
│n         │x         │
╞══════════╪══════════╡
│{name: n1}│{name: n4}│
├──────────┼──────────┤
│{name: n1}│{name: n5}│
└──────────┴──────────┘
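
If the goal is the pruned subgraph itself rather than (n, x) pairs, one possible sketch is to collect the distinct end nodes and relationships of the surviving paths. This is an assumption about what the asker wants; it reuses the ALL() filter from InverseFalcon's answer, and the aliases keptNodes and keptRels are made up for illustration. Note that the start node n is not included in keptNodes.

```cypher
// Sketch on the example graph above; keptNodes / keptRels are illustrative aliases.
MATCH (n {name: 'n1'})-[rs*]->(x)
WHERE ALL(rel IN rs WHERE rel.relevance_score IS NULL)
// one row per relationship on each surviving path
UNWIND rs AS rel
// DISTINCT removes the duplicates that come from overlapping paths
RETURN collect(DISTINCT x) AS keptNodes, collect(DISTINCT rel) AS keptRels
```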

Update: see InverseFalcon's answer for a simpler solution.

Upvotes: 2
