VarYaz

Reputation: 141

Merging two nodes running endlessly in Neo4j

I am new to Neo4j and am trying to merge two sets of nodes, as shown below:

MATCH (n:node2) MERGE (p:node1 {p.id:n.id}) ON CREATE SET p.column1=n.column1,p.column2=n.column2, p.column3=n.column3,p.column4=n.column4,p.column5=n.column5,p.column6=n.column6, p.column7=n.column7 ON MATCH SET p.column1=n.column1,p.column2=n.column2, p.column3=n.column3,p.column4=n.column4,p.column5=n.column5,p.column6=n.column6, p.column7=n.column7;

The node1 label has 2 million nodes with 8 properties each, and the node2 label has 184,000 nodes with 8 properties each.

I am trying to merge node2 records with node1, but this merge runs endlessly. Is there any way to run this merge command in less time?

Upvotes: 2

Views: 91

Answers (2)

Charchit Kapoor

Reputation: 9284

To speed up the MERGE, create an index on the id property of nodes labeled node1. Without one, every MERGE has to scan the node1 nodes to look for a match. Like this:

CREATE INDEX id_index IF NOT EXISTS FOR (n:node1) ON (n.id)
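Before rerunning the merge, it may be worth confirming that the index is online and that the planner uses it. The statements below are a minimal sketch, assuming a Neo4j version that supports SHOW INDEXES (4.2 or later) and the index name id_index from the statement above; the 300-second timeout is an arbitrary choice.

```cypher
// Wait (up to 300 seconds) for indexes that are still populating.
CALL db.awaitIndexes(300);

// List the indexes; id_index should be reported with state ONLINE.
SHOW INDEXES;

// Show the query plan without executing the merge.
// The lookup on node1 should appear as an index seek rather than a label scan.
EXPLAIN
MATCH (n:node2)
MERGE (p:node1 {id: n.id})
RETURN count(p);
```

If the plan still shows a label scan on node1, the index is probably not yet online or was created on a different label or property.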

Secondly, your query can be simplified a bit. Instead of manually setting each property in both ON CREATE and ON MATCH, like this:

MATCH (n:node2) 
MERGE (p:node1 {id:n.id}) 
ON CREATE SET p.column1=n.column1,p.column2=n.column2, p.column3=n.column3,
p.column4=n.column4,p.column5=n.column5,p.column6=n.column6, p.column7=n.column7 
ON MATCH SET p.column1=n.column1,p.column2=n.column2, 
p.column3=n.column3,p.column4=n.column4,
p.column5=n.column5,p.column6=n.column6, p.column7=n.column7;

You can simply do this:

MATCH (n:node2) 
MERGE (p:node1 {id:n.id}) 
SET p += properties(n)
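To illustrate what SET p += properties(n) does, here is a small sketch on made-up data, assuming an otherwise empty test database. The property named extra and all the values are hypothetical. The += operator overwrites or adds the properties found in the map and leaves the other properties of p untouched.

```cypher
// Hypothetical toy data: one existing node1 and one node2 with the same id.
CREATE (:node1 {id: 1, column1: 'old', extra: 'kept'});
CREATE (:node2 {id: 1, column1: 'new', column2: 'added'});

// Merge on id, then copy every property of n onto p.
MATCH (n:node2)
MERGE (p:node1 {id: n.id})
SET p += properties(n)
RETURN p.id, p.column1, p.column2, p.extra;
// Returns one row: 1, 'new', 'added', 'kept'
```

Because the same SET runs whether the node was created or matched, it covers both the ON CREATE and ON MATCH branches of the longer query.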

Upvotes: 2

jose_bacoy

Reputation: 12684

You are copying node2 nodes into node1 when the node2 id does not exist in node1, and updating property values in node1 when the id is found. In short, you need to find the node2 nodes that are not in node1 and then do the merge.

MATCH (n:node2) 
WHERE NOT EXISTS( (:node1 {id: n.id})--() ) 
WITH n
MERGE (p:node1 {id:n.id})    // Fixed a typo: the key should be id, not p.id
SET p += properties(n)

If the query still takes too long, install the APOC library and run the query below. It performs the merge in batches of 10,000.

call apoc.periodic.iterate(
    "
    MATCH (n:node2) WHERE NOT EXISTS( (:node1 {id: n.id})--() ) 
    RETURN n
    ",
    "
    WITH n
    MERGE (p:node1 {id:n.id})
    SET p += properties(n)
    ",
    {batchSize:10000}
)
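If installing APOC is not an option, a similar batching effect may be achievable with Cypher alone. This is a sketch that assumes Neo4j 4.4 or later, where CALL { ... } IN TRANSACTIONS is available, and an auto-commit transaction (in Neo4j Browser that means prefixing the query with :auto). It leaves out the NOT EXISTS filter so that node1 nodes that already exist are updated as well.

```cypher
// In Neo4j Browser, prefix this query with :auto so it runs as an auto-commit transaction.
MATCH (n:node2)
CALL {
  WITH n
  // Find or create the node1 with the same id, then copy the properties across.
  MERGE (p:node1 {id: n.id})
  SET p += properties(n)
} IN TRANSACTIONS OF 10000 ROWS;
```

Each batch of 10,000 rows is committed separately, which keeps the memory held by a single transaction bounded.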

Upvotes: 2
