bahzad

Reputation: 9

How can I speed up importing a CSV file into Neo4j?

I'm using the code below to import the data, but it takes a very long time even though the file is small (only 3.3 MB). Can I modify the code, or use a different method, to speed up the import?

code:

LOAD CSV WITH HEADERS FROM 'file:///Musae-Github.csv' AS line
WITH toInteger(line.source) AS Source, toInteger(line.destination) AS Destination
MERGE (a:person {name: Source})
MERGE (b:person {name: Destination})
MERGE (a)-[:Freind]-(b)
RETURN *

Upvotes: 0

Views: 480

Answers (2)

jose_bacoy

Reputation: 12684

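You can use transaction batching to speed up importing CSV files into Neo4j. Also make sure you have an index on Person.name, so that each MERGE can look nodes up through the index instead of scanning every Person node. Labels are case-sensitive, so the label in the index must match the label used in the MERGE clauses.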

CREATE INDEX PersonNameIndex IF NOT EXISTS FOR (p:Person) ON (p.name)
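As a possible alternative to the plain index, a uniqueness constraint on the same property is also backed by an index and additionally prevents duplicate nodes. The sketch below assumes Neo4j 5 syntax (older 4.x releases use `ON ... ASSERT` instead of `FOR ... REQUIRE`), and the constraint name `PersonNameUnique` is made up for illustration. Create either the index or the constraint, not both, since a constraint cannot be added while a plain index already covers the same label and property.

```cypher
// Hypothetical constraint name; assumes Neo4j 5 syntax.
CREATE CONSTRAINT PersonNameUnique IF NOT EXISTS
FOR (p:Person) REQUIRE p.name IS UNIQUE;

// List indexes to confirm that one exists for :Person(name) and is ONLINE.
SHOW INDEXES;
```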

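Below is the updated script. The inner file path uses double quotes so that it does not terminate the surrounding single-quoted query string:

CALL apoc.periodic.iterate('
CALL apoc.load.csv("file:///Musae-Github.csv") YIELD map AS line RETURN line
','
WITH toInteger(line.source) AS Source, toInteger(line.destination) AS Destination
MERGE (a:Person {name: Source})
MERGE (b:Person {name: Destination})
MERGE (a)-[:Friend]-(b)
', {batchSize: 10000, iterateList: true, parallel: true});

If batches fail with deadlock errors, set parallel to false: concurrent batches that MERGE relationships on the same nodes can contend for locks.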

reference: https://neo4j.com/labs/apoc/4.3/import/load-csv/#_transaction_batching
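If APOC is not installed, similar batching may be possible with plain Cypher. This is only a sketch: it assumes Neo4j 4.4 or later, where `CALL { ... } IN TRANSACTIONS` is available, and it keeps the `Person`/`Friend` names from the script above. Newer 5.x releases prefer the `CALL (line) { ... }` form for passing the variable in.

```cypher
// Must run as an implicit (auto-commit) transaction;
// in Neo4j Browser, prefix the query with :auto
LOAD CSV WITH HEADERS FROM 'file:///Musae-Github.csv' AS line
CALL {
  WITH line
  MERGE (a:Person {name: toInteger(line.source)})
  MERGE (b:Person {name: toInteger(line.destination)})
  MERGE (a)-[:Friend]-(b)
} IN TRANSACTIONS OF 10000 ROWS;
```

Each batch of 10000 rows is committed separately, which keeps the memory used per transaction bounded. For a 3.3 MB file the index on the merged property is likely to matter more than the batch size.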

Upvotes: 1

Vincent Rupp

Reputation: 655

Your code looks fine, although you probably don't need the RETURN * at the end; it only sends every row back to the client.

Try seeing how long it takes just to process the file:

LOAD CSV FROM "yourpathhere" AS line
RETURN linenumber(), datetime(), line

This should be very fast. If it is, the time is being spent in the MERGE clauses rather than in reading the file.
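Two further checks along the same lines, as a rough sketch. The first only counts rows, so nothing large is sent back to the client. The second uses EXPLAIN, which shows the query plan without running the query. It assumes the lowercase `person` label from the question, and the value 1 is an arbitrary placeholder.

```cypher
// Time the file read alone, returning a single number.
LOAD CSV WITH HEADERS FROM 'file:///Musae-Github.csv' AS line
RETURN count(line) AS rows;

// Inspect the plan for one MERGE without executing it.
// A NodeByLabelScan operator suggests no usable index on :person(name);
// a NodeIndexSeek (or NodeUniqueIndexSeek) suggests the index is being used.
EXPLAIN MERGE (a:person {name: 1});
```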

Upvotes: 0
