jack

Reputation: 3

Neo4j: skip nodes that have even just a single relationship matching a query

The scenario is the following:

Example input:

CREATE (a:x {name: 'a'}), (b:x {name: 'b'}), (c:x {name: 'c'});

CREATE (d:y {name: 'd', attrib: 1}), (e:y {name: 'e', attrib: 2}),
       (f:y {name: 'f', attrib: 3}), (g:y {name: 'g', attrib: 4}),
       (h:y {name: 'h', attrib: 5}), (i:y {name: 'i', attrib: 6});

MATCH (a), (d), (e) WHERE a.name = 'a' AND d.name = 'd' AND e.name = 'e'
CREATE (a)-[r:z]->(d), (a)-[s:z]->(e) RETURN *;

MATCH (b), (f), (g) WHERE b.name = 'b' AND f.name = 'f' AND g.name = 'g'
CREATE (b)-[r:z]->(f), (b)-[s:z]->(g) RETURN *;

MATCH (c), (h), (i) WHERE c.name = 'c' AND h.name = 'h' AND i.name = 'i'
CREATE (c)-[r:z]->(h), (c)-[s:z]->(i) RETURN *;

Here I want to return all the x nodes except those that are linked to a y node that has attrib = 5.

Here's what I tried:

MATCH (n:x)-[]-(m:y) WHERE NOT m.attrib = 5 RETURN n

From this query I get all x nodes, that is: a, b and c. I would like to exclude c, because it's linked to h, which has h.attrib = 5.

Edit:

I found a query that does the job:

MATCH (n:x), (m:x)-[]-(o:y)
WHERE o.attrib = 5
WITH collect(n) as all_x_nodes, collect(m) as bad_x_nodes
RETURN [n IN all_x_nodes WHERE NOT n IN bad_x_nodes]

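The problem is that it's not efficient: the two disconnected patterns in the MATCH produce a Cartesian product between all x nodes and the matched paths. Is there a better alternative?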

Upvotes: 0

Views: 660

Answers (2)

InverseFalcon

Reputation: 30397

A better approach is to first find the :x nodes you want to exclude (those connected to a :y node with the specific attribute), collect them, and then match all :x nodes that aren't in that collection:

MATCH (exclude:x)--(:y {attrib: 5})
WITH collect(DISTINCT exclude) AS excluded
MATCH (n:x)
WHERE NOT n IN excluded
RETURN collect(n) AS result

An alternative approach, using APOC Procedures, is to collect both sets of nodes and subtract the excluded collection from the full one:

MATCH (exclude:x)--(:y {attrib: 5})
WITH collect(DISTINCT exclude) AS excluded
MATCH (n:x)
WITH excluded, collect(n) AS nodes
RETURN apoc.coll.subtract(nodes, excluded) AS result

In either case, it would help to have an index on :y(attrib). In this data set it doesn't matter. On much larger sets it will.
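
As a rough sketch, the index could be created like this. The name y_attrib is just a placeholder I picked, and which form applies depends on your Neo4j version, so check the documentation for the release you run:

```cypher
// Neo4j 4.x and later: named index (y_attrib is a hypothetical name).
CREATE INDEX y_attrib FOR (n:y) ON (n.attrib);

// Older 3.x releases used this form instead:
// CREATE INDEX ON :y(attrib);
```

With the index in place, the lookup of (:y {attrib: 5}) can start from an index seek rather than a scan over every :y node.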

Upvotes: 0

cybersam

Reputation: 66967

This simple query should do exactly what you asked for: "return all the x nodes except those that are linked to a y node that has attrib = 5."

MATCH (n:x)
WHERE NOT (n)--(:y {attrib: 5})
RETURN n;
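
On the sample data from the question this should return a and b, since c is the only x node linked to a y node with attrib = 5 (node h). If you are on Neo4j 4.0 or later, the same predicate can also be written as an existential subquery. This is only a sketch under that version assumption, not a different algorithm:

```cypher
// Assumes Neo4j 4.0+ (existential subquery syntax).
// Keeps every x node that has no relationship to a y node with attrib = 5.
MATCH (n:x)
WHERE NOT EXISTS { MATCH (n)--(:y {attrib: 5}) }
RETURN n.name AS name
ORDER BY name;
```

Both forms filter once per x node, rather than once per matched path, which is why a single matching relationship is enough to exclude the node.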

Upvotes: 1
