raf

Reputation: 147

Neo4j: what is the most efficient solution to import csv?

I have the following code, and I need a more efficient approach (if one exists), because my CSV has a lot of rows and Neo4j takes too much time to add them all.

USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///registry_office.csv" AS f
FIELDTERMINATOR "|"
WITH f AS a
WHERE NOT a.JobName IS NULL AND NOT a.JobCode IS NULL
  AND NOT a.JobDescription IS NULL AND NOT a.JobLongDescription IS NULL
  AND NOT a.Long_Description IS NULL AND NOT a.Position IS NULL
  AND NOT a.birthDate IS NULL AND NOT a.startWorkingDate IS NULL
MERGE (b:Job {Name: a.JobName, Code: a.JobCode, Job: a.JobDescription,
  JobLongDescription: a.JobLongDescription})
MERGE (c:Person {PersonName: a.PersonName, PersonSurname: a.PersonSurname,
  CF: a.CF, birthDate: a.birthDate, address: a.address, age: a.age,
  married: a.married, birthPlace: a.birthPlace})
MERGE (b)<-[:RELATED_TO {startWorkingDate: a.startWorkingDate,
  JobPosition: a.Position}]-(c)
RETURN *;

Do you have any suggestions for me?

Upvotes: 0

Views: 37

Answers (1)

Izhaki

Reputation: 23586

The import tool (neo4j-admin import) is generally much faster than LOAD CSV.
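
For reference, an invocation might look roughly like this; the file names are placeholders for the split CSVs described below, and the exact flags depend on your Neo4j version:

    neo4j-admin import --delimiter "|" \
        --nodes:Job jobs.csv \
        --nodes:Person persons.csv \
        --relationships:RELATED_TO rels.csv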

However, your query suggests that each CSV row ends up as a pattern (b)<--(c), so you'd need to do some pre-processing on this CSV: first filter out rows with null values, then split it into 3 CSVs (2 for nodes, 1 for relationships).

To do this, you have 3 main options:

  • Excel - not viable for huge CSVs
  • A CLI tool - something like csvkit
  • A program - if you are OK with Python or JavaScript, you'll be able to do this in 20 minutes or so (see the sketch below).
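
As one illustration, here is a minimal Python sketch of that pre-processing, assuming the column names from your query; the output file names, the use of JobCode and CF as unique keys, and the ID-space names in the headers are all assumptions you'd adapt to your data:

    import csv

    # Required (non-empty) columns, matching the WHERE clause in the question.
    REQUIRED = ["JobName", "JobCode", "JobDescription", "JobLongDescription",
                "Long_Description", "Position", "birthDate", "startWorkingDate"]

    jobs, people, rels = {}, {}, []

    with open("registry_office.csv", newline="") as src:
        for row in csv.DictReader(src, delimiter="|"):
            # Drop rows with missing or empty required fields.
            if any(not row.get(col) for col in REQUIRED):
                continue
            jobs[row["JobCode"]] = row      # assume JobCode identifies a Job
            people[row["CF"]] = row         # assume CF identifies a Person
            rels.append((row["CF"], row["JobCode"],
                         row["startWorkingDate"], row["Position"]))

    def write_csv(path, header, rows):
        with open(path, "w", newline="") as out:
            writer = csv.writer(out, delimiter="|")
            writer.writerow(header)
            writer.writerows(rows)

    # Node files: one row per distinct Job / Person.
    write_csv("jobs.csv",
              ["Code:ID(Job)", "Name", "Job", "JobLongDescription"],
              [(r["JobCode"], r["JobName"], r["JobDescription"],
                r["JobLongDescription"]) for r in jobs.values()])

    write_csv("persons.csv",
              ["CF:ID(Person)", "PersonName", "PersonSurname", "birthDate",
               "address", "age", "married", "birthPlace"],
              [(r["CF"], r["PersonName"], r["PersonSurname"], r["birthDate"],
                r["address"], r["age"], r["married"], r["birthPlace"])
               for r in people.values()])

    # Relationship file: Person -> Job, with the relationship properties.
    write_csv("rels.csv",
              [":START_ID(Person)", ":END_ID(Job)",
               "startWorkingDate", "JobPosition"],
              rels)

The headers follow the import tool's field:ID(id-space) / :START_ID / :END_ID convention; the node labels and relationship type come from the --nodes:Job / --nodes:Person / --relationships:RELATED_TO arguments shown earlier.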

Upvotes: 1
