Reputation: 187
I want to delete the connected graph around a particular node in a Neo4j database using Cypher. The use case is to delete a "start" node together with every node that has a path to it. To keep each transaction small, the query has to run iteratively, and no iteration may disconnect the remaining nodes from the start node.
So far I have been using this query:
OPTIONAL MATCH (start {indexed_prop: $PARAM})--(toDelete)
OPTIONAL MATCH (toDelete)--(toBind)
WHERE NOT(id(start ) = id(toBind)) AND NOT((start)--(toBind))
WITH start, collect(toBind) AS TO_BIND, toDelete limit 10000
DETACH DELETE toDelete
WITH start, TO_BIND
UNWIND TO_BIND AS b
CREATE (start)-[:HasToDelete]->(b)
I call it repeatedly until the number of deleted nodes is 0.
Is there a better query for this?
Upvotes: 1
Views: 214
Reputation: 30397
You could try a mark-and-delete approach. It is similar to how you would detach and delete the entire connected graph with a variable-length match, but instead of using DETACH DELETE you apply a :TO_DELETE label.
Something like this (making up a label to use for the start node, as otherwise it has to comb the entire db looking for a node with the indexed param):
MATCH (start:StartNodeLabel {indexed_prop: $PARAM})-[*]-(toDelete)
SET toDelete:TO_DELETE
If that blows up your heap, you can run it multiple times, adding the predicate WHERE NOT toDelete:TO_DELETE before the SET, and using LIMIT and/or a limit on the depth of the variable-length relationship.
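A batched version of the marking step might look like the sketch below. The batch size of 10000 and the returned column name `marked` are assumptions for illustration, and :StartNodeLabel is the made-up label from the query above. Only LIMIT is used here; a depth bound such as [*..3] could be added to the pattern, but on its own it would not reach nodes farther away than that bound.

```cypher
// Label up to 10000 connected nodes that are not labeled yet.
// Re-run until `marked` comes back as 0.
MATCH (start:StartNodeLabel {indexed_prop: $PARAM})-[*]-(toDelete)
WHERE NOT toDelete:TO_DELETE
WITH DISTINCT toDelete
LIMIT 10000
SET toDelete:TO_DELETE
RETURN count(toDelete) AS marked
```

Each run traverses from the start node again, so already-labeled nodes are revisited but filtered out by the WHERE clause. Nothing is deleted during this phase, so the graph stays connected while it is being marked.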
When you're sure you've labeled every connected node, it's just a matter of deleting every node with the :TO_DELETE label. You can run that iteratively, or use the APOC procedure apoc.periodic.commit() to handle it in batches.
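As a rough sketch of the delete phase, assuming a batch size of 10000 and that :TO_DELETE is not used for anything else in the database, the iterative form could be:

```cypher
// Delete one batch of labeled nodes; re-run until `deleted` is 0.
MATCH (n:TO_DELETE)
WITH n LIMIT 10000
DETACH DELETE n
RETURN count(*) AS deleted
```

With APOC installed, the same loop can be handed to apoc.periodic.commit(), which re-runs the statement in separate transactions until it returns 0:

```cypher
// $limit is supplied through the params map in the second argument.
CALL apoc.periodic.commit(
  "MATCH (n:TO_DELETE) WITH n LIMIT $limit DETACH DELETE n RETURN count(*)",
  {limit: 10000}
)
```

The marking query only labels the start node if some path leads back to it, so it may need to be removed separately at the end:

```cypher
// Remove the start node itself once everything around it is gone.
MATCH (start:StartNodeLabel {indexed_prop: $PARAM})
DETACH DELETE start
```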
Upvotes: 2