raf

Reputation: 147

Neo4j: How to load two billion records from a CSV?

I'm trying to import a CSV with 2 billion records into Neo4j (in my real query I have 40 properties and 5 node types). Currently I'm using the following query:

CALL apoc.periodic.commit("
  LOAD CSV WITH HEADERS FROM 'file:///person_job.csv' AS row FIELDTERMINATOR '|'
  WITH row AS a
  WHERE NOT a.id IS NULL
  MERGE (b:Person {id: a.id})
  MERGE (c:Job {type: a.type})
  MERGE (b)<-[:RELATED_TO]-(c)",
  {limit: 2000000});

I created indexes on id and on type (statements below, for reference), but the query still takes five days to finish. Do you have any idea how to improve the efficiency of this query?
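The indexes were created along these lines (Neo4j 3.x syntax; labels and properties match the query above):

CREATE INDEX ON :Person(id);
CREATE INDEX ON :Job(type);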

Upvotes: 0

Views: 76

Answers (1)

Dave Fauth

Reputation: 878

If this is a one-time load or an initial load, you should use the offline bulk importer, neo4j-admin import (formerly neo4j-import). Also, 2,000,000 rows is a very large commit batch; ensure you have a large enough heap to handle transactions of that size.
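A minimal sketch of the bulk-import call, assuming Neo4j 4.x flag syntax; the file names (persons.csv, jobs.csv, related_to.csv) and their header lines are hypothetical. You would split your data into node and relationship files with :ID / :START_ID / :END_ID header columns:

# Run against a stopped, empty database; paths and file names are hypothetical.
# persons.csv header:    id:ID(Person)|name|...
# jobs.csv header:       type:ID(Job)
# related_to.csv header: :START_ID(Job)|:END_ID(Person)
bin/neo4j-admin import \
  --delimiter="|" \
  --id-type=STRING \
  --nodes=Person=import/persons.csv \
  --nodes=Job=import/jobs.csv \
  --relationships=RELATED_TO=import/related_to.csv

If you stay with the APOC route instead, the heap is set in conf/neo4j.conf, e.g. dbms.memory.heap.max_size=16g (the 16g value here is just an example; size it to your machine).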

Upvotes: 1
