Cypher: preventing results from duplicating on WITH / sequential querying

Question

In a query like this

MATCH (a)
WHERE id(a) = {x}

MATCH (a)-->(b:x)

WITH a, collect(DISTINCT id(b)) AS Bs

MATCH (a)-->(c:y)

RETURN collect(c) + Bs

what I'm trying to do is gather two sets of nodes that come from different queries, but with this kind of procedure all the b rows are returned multiplied by the number of a rows.

How should I deal with this kind of problem that arises from sequential queries?

[Note that the reported query is only a conceptual representation of what I mean. Please don't try to solve the code (that would be trivial) but only the presented problem.]

jjaderberg · Accepted Answer

Your query shouldn't return any cross product, since you aggregate in the WITH clause: there is only one result row (a together with the collected b ids) when the second MATCH begins. It's therefore not clear what problem you want solved; cross products are solved differently in different cases.
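To make the row counts concrete, here is a minimal sketch of the question's query with comments marking how many rows exist at each step. It keeps the legacy {x} parameter syntax from the question (newer Neo4j versions write $x). The final WITH and the names Cs and ids are additions made here for illustration, so that both lists are built before they are concatenated.

```cypher
MATCH (a)
WHERE id(a) = {x}                        // one row: the single node a
MATCH (a)-->(b:x)                        // one row per matched relationship to an :x node
WITH a, collect(DISTINCT id(b)) AS Bs    // aggregation: back to one row per a
MATCH (a)-->(c:y)                        // one row per match, each carrying the same Bs list
WITH Bs, collect(DISTINCT id(c)) AS Cs   // aggregate again: one row
RETURN Bs + Cs AS ids
```

If the first WITH passed b along without collect, every b row would be paired with every c row by the second MATCH, which is the kind of duplication the question describes.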

Conceptually, your query works like this: match anything that a has an outgoing relationship to, then keep only the nodes that have label :x. The second leg of the query does the same but filters on label :y. You can therefore combine the two legs as

MATCH (a)-->(b)
WHERE id(a) = {x} AND (b:x OR b:y)
RETURN b

Other cases of 'path explosion' can't be solved as easily (sometimes UNION is good, sometimes you can reorder your pattern, sometimes you can aggregate and reduce to make it work), but you'll have to ask about those separately.
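As one illustration of the UNION option, here is a sketch under the same assumptions as the question (lookup by id via {x}, labels :x and :y). Each leg runs independently, so neither can multiply the other's rows. The column alias node is a name chosen here for illustration.

```cypher
// First leg: nodes labelled :x reachable from a
MATCH (a)-->(b:x)
WHERE id(a) = {x}
RETURN b AS node
UNION
// Second leg: nodes labelled :y reachable from a
MATCH (a)-->(c:y)
WHERE id(a) = {x}
RETURN c AS node
```

Both legs must return the same column names. UNION removes duplicate rows from the combined result, while UNION ALL would keep them.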
