Reputation: 1672
There is a tree structure stored in a Neo4j database. I need to delete a node together with all of its child nodes. I can propose two approaches so far:
Can one estimate the effectiveness of these approaches and determine which one is better (faster) without running a benchmark?
Upvotes: 0
Views: 540
Reputation: 66989
Approach #2, which uses threads to concurrently delete nodes/relationships in the same subgraph, is prone to errors, and should be avoided.
When deleting a relationship, neo4j's default locking mechanism will lock the relationship AND its endpoints; this can cause deadlock errors when multiple threads concurrently attempt to delete nodes/relationships in the same subgraph.
Also, a thread may discover that a node/relationship it is attempting to work on has disappeared (due to the actions of other threads).
Here is a sample Cypher query that uses approach #1. It should find all distinct nodes in a Foo/BAR tree and delete the tree (using DETACH DELETE, which @ToreEschliman also suggested):
[EDITED]
MATCH p=(a:Foo {id: 123})-[:BAR*0..]->(b:Foo)
WITH COLLECT(b) AS ns1
UNWIND ns1 AS n
WITH COLLECT(DISTINCT n) AS ns2
FOREACH(y IN ns2 | DETACH DELETE y);
[UPDATE]
Based on new info from the comments, here is how to delete the entire tree rooted at a specific CodeSet node:
MATCH p=(root:CodeSet {id: 123})<-[*0..]-(node)
DETACH DELETE p;
The MATCH pattern used assumes that all the descendant nodes are connected via relationships directed towards the root node.
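To make the direction assumption concrete, here is a small in-memory Python sketch (not Neo4j code) of which nodes a pattern like (root)<-[*0..]-(node) would reach. The function name nodes_matched and the (start, end) edge tuples are hypothetical, chosen only for illustration: a node is collected only if it has a directed path ending at the root.

```python
from collections import deque

def nodes_matched(edges, root):
    """Return every node with a directed path (length >= 0) ending at root.

    edges is an iterable of (start, end) pairs, one per directed relationship.
    This is an illustrative model of the pattern, not how Neo4j evaluates it.
    """
    # Index relationships by their end node so we can walk them backwards.
    incoming = {}
    for start, end in edges:
        incoming.setdefault(end, []).append(start)

    seen = {root}          # the *0.. lower bound means the root itself matches
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for start in incoming.get(current, []):
            if start not in seen:
                seen.add(start)
                queue.append(start)
    return seen

# "a" and "b" point towards the root; "c" hangs off a relationship pointing away.
edges = [("a", "root"), ("b", "a"), ("root", "c")]
print(sorted(nodes_matched(edges, "root")))  # ['a', 'b', 'root']
```

In this toy graph, "c" is not collected because its only relationship points away from the root, which is the case the assumption above rules out.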
Upvotes: 1
Reputation: 2507
You can estimate, yes, and you should estimate that the first one is always faster. If you can write a query that identifies all your bad nodes, just do so, then DETACH DELETE those nodes at the end. One transaction, one Cypher translation, and then the rest is handled in specialized, purpose-built database code. If you can come up with a faster way to do it at application level, you should be writing a competing database.
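As a rough sketch of the "one transaction, one query" idea from application code, the following assumes the official Neo4j Python driver (5.x), where a session offers execute_write and a transaction offers run. The helper names delete_subtree and _delete_work, the Foo label, the BAR relationship type, and the id property are assumptions carried over from the examples in this thread, not a definitive implementation.

```python
# Assumed schema: (:Foo {id})-[:BAR]->(:Foo), with relationships pointing
# from parent to child. Adjust labels and direction to the real data model.
DELETE_SUBTREE = (
    "MATCH (root:Foo {id: $root_id})-[:BAR*0..]->(n:Foo) "
    "WITH DISTINCT n "
    "DETACH DELETE n"
)

def _delete_work(tx, root_id):
    # Executed inside a single managed write transaction.
    result = tx.run(DELETE_SUBTREE, root_id=root_id)
    # consume() exhausts the result and returns a summary with update counters.
    return result.consume().counters.nodes_deleted

def delete_subtree(session, root_id):
    """Delete the subtree rooted at root_id; return the number of deleted nodes.

    session is expected to behave like a neo4j.Session (hypothetical usage).
    """
    return session.execute_write(_delete_work, root_id)
```

With a real driver this would typically be called as delete_subtree(session, 123) inside a "with driver.session() as session:" block. All traversal and deletion happen server-side in one transaction, so there are no competing threads contending for locks on the same subgraph.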
Upvotes: 1