Reputation: 11

GraphDB's loadrdf tool loads ontology and data very slowly

I am using the GraphDB loadrdf tool to load an ontology and a fairly large dataset. I set pool.buffer.size=800000 and the JVM -Xmx to 24g. I tried both parallel and serial modes. Both slow down once the total number of statements in the repository goes over about 10k, and the rate eventually drops to 1 or 2 statements/second. Does anyone know whether this is normal behavior for loadrdf, or whether there is a way to optimize the performance?

Edit: I have increased tuple-index-memory. Here is part of my repository ttl configuration:

owlim:entity-index-size "45333" ; 
owlim:cache-memory "24g" ; 
owlim:tuple-index-memory "20g" ; 
owlim:enable-context-index "false" ; 
owlim:enablePredicateList "false" ; 
owlim:predicate-memory "0" ;  
owlim:fts-memory "0" ; 
owlim:ftsIndexPolicy "never" ; 
owlim:ftsLiteralsOnly "true" ; 
owlim:in-memory-literal-properties "false" ; 
owlim:transaction-mode "safe" ; 
owlim:transaction-isolation "true" ; 
owlim:disable-sameAs "true";

But the process still slows down. It starts with "Global average rate: 1,402 st/s" but drops to "Global average rate: 20 st/s" after "Statements in repo: 61,831". My JVM settings are: -Xms24g -Xmx36g

Upvotes: 1

Answers (2)

Venelin

Reputation: 66

I've looked at your repository configuration ttl. It contains the parameter entity-index-size=45333, whose value needs to be increased, e.g. to 100 million (entity-index-size=100000000). The default value for that parameter in GraphDB 7 is 10M, but since you've set it explicitly, the default gets overridden.

You can read more about that parameter here
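As a rough sketch, the change would touch only one line of the configuration fragment posted in the question. This assumes the remaining parameters stay exactly as posted; the value 100000000 is the suggestion from this answer, not a tuned figure for your dataset.

```turtle
# Fragment of the repository configuration from the question.
# Only entity-index-size changes; everything else is left as posted.
owlim:entity-index-size "100000000" ;   # was "45333"
owlim:tuple-index-memory "20g" ;        # unchanged
```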

Upvotes: 0

nikolavp

Reputation: 173

Can you please post your repository configuration? Inside it, there is a parameter tuple-index-memory, which determines the amount of changes (disk pages) that we are allowed to keep in memory. The bigger this value is, the fewer flushes we are going to do.

Check whether this is set to a value like 20G in your setup and retry the process.

Upvotes: 1
