Adding a property filter to a Cypher query explodes memory, why?

Question

I'm trying to write a query that explores a DAG-type graph (a bill of materials) for all construction paths leading down to a specific part number (second MATCH), among all the parts associated with a given product (first MATCH). I'm seeing some strange behavior that I don't understand.

This query runs in a reasonable time (~2 s) using Neo4j Community Edition:

WITH '12345' AS snid, 'ABCDE' AS pid
MATCH (m:Product {full_sn: snid})-[:uses]->(p:Part)
WITH snid, pid, collect(p) AS mparts
MATCH path = (anc:Part)-[:has*]->(child:Part)
WHERE ALL(node IN nodes(path) WHERE node IN mparts)
WITH snid, path, relationships(path)[-1] AS rel,
     nodes(path)[-2] AS parent, nodes(path)[-1] AS child
RETURN stuff I want

However, to get the query I want, I must add a filter on the child using the part number pid in the second MATCH statement:

MATCH path=(anc:Part)-[:has*]->(child:Part {pn:pid})

When I try to run the new query, Neo4j Browser complains that there is not enough memory (Neo.TransientError.General.OutOfMemoryError). When I run it with EXPLAIN, the db hits explode into the tens of billions, as if I were asking for a massive Cartesian product. But all I have done is add a restriction on the child, so this should reduce the search space, shouldn't it?

I also tried adding an index on :Part(pn). Now the profile shown by EXPLAIN looks very efficient, but I still have the same memory error.
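
For reference, an index on :Part(pn) can be created roughly as sketched below. The exact statement used is not given in the question, so this is an assumption: the index name part_pn is hypothetical, and which form applies depends on the Neo4j version.

```cypher
// Newer syntax (Neo4j 4.x and later); the index name part_pn is a hypothetical choice.
CREATE INDEX part_pn FOR (p:Part) ON (p.pn);

// Older syntax (Neo4j 3.x, deprecated in 4.x); use this instead on an older server.
// CREATE INDEX ON :Part(pn);
```

Once the index is online, EXPLAIN on a lookup such as MATCH (c:Part {pn: 'ABCDE'}) RETURN c would be expected to show an index seek on :Part(pn) rather than a label scan.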

If anyone can help me understand why this change between the two queries is causing problems, I'd greatly appreciate it!

Best wishes,

Ben
