Reputation: 373
I am trying to use Neo4j with neomodel to represent some graph relationships. However, I run into performance issues when I try to construct a graph with millions of nodes and relationships.
When I have a graph with 10k nodes and 30k relationships among them, it takes 4 min 20 s to import it into Neo4j: 1 min 40 s to create the nodes and 2 min 40 s to create the relationships by calling foo.connect(bar). That is extremely slow.
When I use the batch API provided by neomodel, I can create all nodes in just 4 s, but that does not change the time needed to create the relationships.
Neomodel uses Cypher queries to create relationships one by one. So I decided to write my own queries, where I first match all the nodes needed for 100 relationships and then create those relationships. Once or twice this finished in a few seconds; in other cases it again takes minutes. When I watch htop to see what is going on, I can see that 2 cores are fully utilized by the Neo4j database.
I have found the following article: Import 10M Stack Overflow Questions into Neo4j In Just 3 Minutes, which uses neo4j-import, but I would like to avoid it.
I am using the default configuration, except that I set dbms.jvm.additional=-Xss256M to be able to execute those batch relationship queries. I have a unique index on the property that I use for node lookup. Before each experiment I delete all nodes and relationships.
Do you have any idea how to speed it up?
Upvotes: 4
Views: 1670
Reputation: 41676
How many rels do your nodes have?
In general, I don't think object mappers are a good fit for mass insertions.
Please check out: https://medium.com/@mesirii/5-tips-tricks-for-fast-batched-updates-of-graph-structures-with-neo4j-and-cypher-73c7f693c8cc
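As a rough illustration of the batched approach, here is a minimal sketch that sends many relationships per query through a single UNWIND over a parameter list. The label Item, the property uid, the relationship type LINKS_TO and the helper names are all hypothetical placeholders, and the batch size is only a starting point to tune. The runner is passed in as a callable, so the sketch itself does not depend on any particular driver.

```python
from itertools import islice
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical schema: replace Item, uid and LINKS_TO with your own names.
# The $rows parameter syntax assumes Neo4j 3.1 or later; older versions
# use {rows} instead.
BATCH_QUERY = """
UNWIND $rows AS row
MATCH (a:Item {uid: row.src})
MATCH (b:Item {uid: row.dst})
CREATE (a)-[:LINKS_TO]->(b)
"""


def chunked(items: Iterable[Any], size: int) -> Iterator[List[Any]]:
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def create_relationships(
    run_query: Callable[[str, Dict[str, Any]], Any],
    pairs: Iterable[Tuple[Any, Any]],
    batch_size: int = 10000,
) -> int:
    """Send one parameterized query per batch of (src, dst) pairs.

    `run_query` is any callable taking (query, params).
    Returns the number of pairs submitted.
    """
    total = 0
    for batch in chunked(pairs, batch_size):
        rows = [{"src": src, "dst": dst} for src, dst in batch]
        run_query(BATCH_QUERY, {"rows": rows})
        total += len(rows)
    return total
```

With neomodel, something like create_relationships(db.cypher_query, pairs) after from neomodel import db should work as the runner, since db.cypher_query accepts a query string and a parameter dict. Because the lookups go through the indexed property, each MATCH should be an index seek rather than a label scan.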
Can you enable query logging for queries taking longer than 1 second and share the queries that neomodel generates?
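For reference, on a Neo4j 3.x install the query log can be switched on in neo4j.conf with something like the fragment below. The setting names are assumed from the 3.x line and may differ in other versions, so check the configuration reference for the version you run.

```properties
# Log queries that take longer than the threshold (Neo4j 3.x setting names assumed)
dbms.logs.query.enabled=true
dbms.logs.query.threshold=1s
```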
dbms.jvm.additional=-Xss256M is excessive. It means every thread gets a 256M stack; usually 2M is good enough for that.
Upvotes: 1