Reputation: 147
I'm trying to import data from a CSV with 2 billion records into Neo4j. I'm currently using the following query (my real query has 40 properties and 5 node types):
call apoc.periodic.commit("
  LOAD CSV WITH HEADERS FROM 'file:///person_job.csv' AS row FIELDTERMINATOR '|'
  WITH row AS a
  WHERE NOT a.id IS NULL
  MERGE (b:Person {id: a.id})
  MERGE (c:Job {type: a.type})
  MERGE (b)<-[:RELATED_TO]-(c)", {limit: 2000000});
I created indexes on id and on type, but the query still takes five days to finish. Do you have any idea how to improve its efficiency?
Upvotes: 0
Views: 76
Reputation: 878
If this is a one-time or initial load, you should use the offline import tool, neo4j-admin import. Also, 2M is a very large commit size; make sure your heap is large enough to handle it.
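As a rough sketch, an offline load for your Person/Job model could look like this (flag syntax assumed from Neo4j 4.x, where the exact options vary by version; the file names and headers are hypothetical):

```shell
# The offline importer only works on a fresh, stopped database.
# Assumed input files (pipe-delimited, matching your FIELDTERMINATOR):
#   persons.csv  ->  id:ID(Person)|name|...           (plus your other properties)
#   jobs.csv     ->  type:ID(Job)
#   rels.csv     ->  :START_ID(Job)|:END_ID(Person)   one row per RELATED_TO edge
neo4j-admin import \
  --delimiter="|" \
  --nodes=Person=persons.csv \
  --nodes=Job=jobs.csv \
  --relationships=RELATED_TO=rels.csv
```

Note that, unlike MERGE, the importer expects each node file to contain unique IDs, so you would need to deduplicate rows (or look at the importer's duplicate-handling options) before loading.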
Upvotes: 1