Reputation: 1303
I am using Neo4j Procedure to create relationships on bulk data.
Initially insert that all data using load csv.
USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM "file:///XXXX.csv" AS row
....
data size is too large[10M] but its successfully executed
my problem is i want to create relationships between this all nodes many-many
but i got exception [OutMemoryException] while executing queries
MATCH(n1:x{REMARKS :"LATEST"}) MATCH(n2:x{REMARKS :"LATEST"}) WHERE n1.DIST_ID=n2.ENROLLER_ID CREATE (n1)-[:ENROLLER]->(n2) ;
I have already created Indexing and Constraints also
Any idea please help me?
Upvotes: 0
Views: 92
Reputation: 29167
The problem is that your query is performed in one transaction, which leads to the exception [OutMemoryException]
. And this is a problem, since at this moment the possibility of periodic transactions only have to load the CSV. So, you can, for example, re-read the CSV after first load:
USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM "file:///XXXX.csv" AS row
MATCH (n1:x{REMARKS :"LATEST", DIST_ID: row.DIST_ID})
WITH n1
MATCH(n2:x{REMARKS :"LATEST"}) WHERE n1.DIST_ID=n2.ENROLLER_ID
CREATE (n1)-[:ENROLLER]->(n2) ;
Or try the trick with periodic committing from the APOC library
:
call apoc.periodic.commit("
MATCH (n2:x {REMARKS:'Latest'}) WHERE exists(n2.ENROLLER_ID)
WITH n2 LIMIT {perCommit}
OPTIONAL MATCH (n1:x {REMARKS:'Latest'}) WHERE n1.DIST_ID = n2.ENROLLER_ID
WITH n2, collect(n1) as n1s
FOREACH(n1 in n1s|
CREATE (n1)-[:ENROLLER]->(n2)
)
REMOVE n2.ENROLLER_ID
RETURN count(n2)",
{perCommit: 1000}
)
P.S. ENROLLER_ID
property is used as a flag for selecting nodes for processing. Of course, you can use another flag, which is set in the processing.
Or a more accurate with apoc.periodic.iterate
:
CALL apoc.periodic.iterate("
MATCH (n1:x {REMARKS:'Latest'})
MATCH (n2:x {REMARKS:'Latest'}) WHERE n1.DIST_ID = n2.ENROLLER_ID
RETURN n1,n2
","
WITH {n1} as n1, {n2} as n2
MERGE (n1)-[:ENROLLER]->(n2)
", {batchSize:10000, parallel:true}
)
Upvotes: 1