Reputation: 330
I'm using the LOAD CSV
command to import nodes and relationships in Neo4j. For better performance I'm also using USING PERIODIC COMMIT
, because the files I import are large (roughly 50 million records each).
I want to know whether it's better to use USING PERIODIC COMMIT 1000
, USING PERIODIC COMMIT 5000
, or an even bigger batch size for performance.
Is it fastest to use a big number, or the opposite?
PS: The machine has plenty of free RAM.
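For reference, a minimal sketch of the kind of import statement in question (the file name, label, and properties are placeholders, not my actual data):

    // Commit every 5000 rows instead of building one huge transaction.
    // nodes.csv, :Person, id, and name are made up for illustration.
    USING PERIODIC COMMIT 5000
    LOAD CSV WITH HEADERS FROM 'file:///nodes.csv' AS row
    CREATE (:Person {id: row.id, name: row.name});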
Thanks
Upvotes: 2
Views: 2173
Reputation: 59
I have been working on something similar; my dataset contains about 700k data points.
I have seen that USING PERIODIC COMMIT 100000
takes more time to insert the data points into the database than USING PERIODIC COMMIT 50000
.
So, in my case the smaller numbers make the process faster, while the larger numbers throw an exception saying there is not enough memory to perform the current task.
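A rough sketch of how I compared the two batch sizes (the file and query here are placeholders, not my exact import):

    // Variant that ran out of memory for me:
    USING PERIODIC COMMIT 100000
    LOAD CSV WITH HEADERS FROM 'file:///points.csv' AS row
    CREATE (:Point {id: row.id});

    // Variant that finished faster in my case:
    USING PERIODIC COMMIT 50000
    LOAD CSV WITH HEADERS FROM 'file:///points.csv' AS row
    CREATE (:Point {id: row.id});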
Upvotes: 2
Reputation: 16365
Big numbers will make the process faster. The reasoning: a bigger number results in fewer commits and, consequently, fewer disk I/O operations.
Example: with 1000 records, USING PERIODIC COMMIT 50
results in 20 disk write operations (1000 records / 50). Changing to USING PERIODIC COMMIT 100
results in 10 disk write operations (1000 records / 100).
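So with plenty of free RAM you could try a much larger batch size; a sketch, assuming a hypothetical rels.csv with from and to columns:

    // One commit per 100000 rows: for 50 million records that is
    // 500 commits instead of the 50000 you would get with a size of 1000.
    USING PERIODIC COMMIT 100000
    LOAD CSV WITH HEADERS FROM 'file:///rels.csv' AS row
    MATCH (a:Person {id: row.from}), (b:Person {id: row.to})
    CREATE (a)-[:KNOWS]->(b);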
Upvotes: 2