camoflage

Reputation: 177

Delete duplicate nodes and their relationships in Neo4j

The Cypher query MATCH (n:BusinessBranch) RETURN n returns all the nodes. I want to delete the duplicate nodes, along with their relationships, where duplicates are nodes that share the same value for the address property. How do I do that?

Upvotes: 5

Views: 6909

Answers (1)

cybersam

Reputation: 66947

[UPDATED]

  1. To delete all BusinessBranch nodes that share the same address property value (which would also require deleting all their relationships):

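    // Group all branches by their address value.
    MATCH (b:BusinessBranch)
    WITH b.address AS address, COLLECT(b) AS branches
    // Keep only the groups that contain duplicates.
    WHERE SIZE(branches) > 1
    // Delete every node in each group, along with its relationships.
    FOREACH (n IN branches | DETACH DELETE n);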

    This query collects all the BusinessBranch nodes that have the same address, filters for collections that have more than one branch, and then uses DETACH DELETE on all the branches in the resulting collections (which will delete the branches and their relationships).

  2. To delete all but one of the duplicate nodes, you could do this:

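    // Group all branches by their address value.
    MATCH (b:BusinessBranch)
    WITH b.address AS address, COLLECT(b) AS branches
    // Keep only the groups that contain duplicates.
    WHERE SIZE(branches) > 1
    // TAIL() skips the first collected node, so one branch per address survives.
    FOREACH (n IN TAIL(branches) | DETACH DELETE n);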

    However, in this case you should first take a look at the APOC procedure apoc.refactor.mergeNodes, which is more appropriate for most use cases.
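
    As a rough sketch of what the APOC approach might look like, assuming the APOC plugin is installed and that you want the surviving node to keep its own property values, something along these lines could work. The config values used here ('discard' and mergeRels: true) and the extra IS NOT NULL filter are my assumptions about what you want, not part of the queries above, so check them against your data before running anything:

    ```cypher
    // Group branches by address, skipping nodes that have no address at all.
    MATCH (b:BusinessBranch)
    WHERE b.address IS NOT NULL
    WITH b.address AS address, COLLECT(b) AS branches
    WHERE SIZE(branches) > 1
    // Merge each group into its first node. The other nodes are removed
    // and their relationships are moved onto the surviving node.
    CALL apoc.refactor.mergeNodes(branches, {properties: 'discard', mergeRels: true})
    YIELD node
    RETURN node.address AS address;
    ```

    Unlike DETACH DELETE, this keeps the relationships of the removed duplicates by reattaching them to the node that remains. The IS NOT NULL filter is there because nodes without an address would otherwise all be grouped together under null and treated as duplicates of each other.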

Upvotes: 17
