stolen_leaves

Reputation: 1462

For embedded DBs, is the DB loaded into main memory in Neo4j?

I am trying to parse a large file and create nodes for it in the Neo4j DB. I use MapReduce, so the following line runs in every reduce call.

 GraphDatabaseService db = new GraphDatabaseFactory().newEmbeddedDatabase(DB_PATH);

After running for some time, this line gives me the following exception:

java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.neo4j.io.pagecache.impl.muninn.MuninnPageCache.<init>(MuninnPageCache.java:230)
    at org.neo4j.kernel.impl.pagecache.ConfiguringPageCacheFactory.createPageCache(ConfiguringPageCacheFactory.java:63)
    at org.neo4j.kernel.impl.pagecache.ConfiguringPageCacheFactory.getOrCreatePageCache(ConfiguringPageCacheFactory.java:56)
    at org.neo4j.kernel.InternalAbstractGraphDatabase.createPageCache(InternalAbstractGraphDatabase.java:704)
    at org.neo4j.kernel.InternalAbstractGraphDatabase.create(InternalAbstractGraphDatabase.java:473)
    at org.neo4j.kernel.InternalAbstractGraphDatabase.run(InternalAbstractGraphDatabase.java:321)
    at org.neo4j.kernel.EmbeddedGraphDatabase.<init>(EmbeddedGraphDatabase.java:59)
    at org.neo4j.graphdb.factory.GraphDatabaseFactory.newDatabase(GraphDatabaseFactory.java:108)
    at org.neo4j.graphdb.factory.GraphDatabaseFactory$1.newDatabase(GraphDatabaseFactory.java:95)
    at org.neo4j.graphdb.factory.GraphDatabaseBuilder.newGraphDatabase(GraphDatabaseBuilder.java:176)
    at org.neo4j.graphdb.factory.GraphDatabaseFactory.newEmbeddedDatabase(GraphDatabaseFactory.java:67)
    at com.kchakrab.BaseGraph.CreateBaseGraphReducer.reduce(CreateBaseGraphReducer.java:29)
    at com.kchakrab.BaseGraph.CreateBaseGraphReducer.reduce(CreateBaseGraphReducer.java:21)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:572)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:414)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:392)

Can someone guide me as to what I am doing wrong? Is the whole database loaded into memory every time, and is loading it in every reduce call causing the GC overhead?

Upvotes: 1

Views: 237

Answers (1)

Stefan Armbruster

Reputation: 39925

By default, Neo4j 2.2 uses up to 75% of your available RAM minus the heap size for the page cache. Depending on your setup, that might be too much.

You should tweak dbms.pagecache.memory to a reasonable value.

Example: assume you have 16 GB RAM. The JVM defaults to 25% (= 4 GB) for the heap. Of the remaining 12 GB, 75% (= 9 GB) is used for the page cache, leaving 3 GB for the OS and other applications. The 75% default is a reasonable choice for a server that runs only Neo4j. If the machine does other things as well (desktop use, other server processes), set dbms.pagecache.memory to something smaller, e.g. 5 GB.
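For an embedded database the same setting can be passed programmatically when the database is built. A minimal sketch, assuming Neo4j 2.2's GraphDatabaseSettings.pagecache_memory setting; the path and the 512m cap are placeholders you should adjust:

    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.factory.GraphDatabaseFactory;
    import org.neo4j.graphdb.factory.GraphDatabaseSettings;

    // Build the embedded database with an explicit page cache cap
    // instead of relying on the 75% default.
    final GraphDatabaseService db = new GraphDatabaseFactory()
            .newEmbeddedDatabaseBuilder("/path/to/graph.db")  // placeholder path
            .setConfig(GraphDatabaseSettings.pagecache_memory, "512m")
            .newGraphDatabase();

    // Shut the database down cleanly when the JVM exits.
    Runtime.getRuntime().addShutdownHook(new Thread() {
        @Override
        public void run() {
            db.shutdown();
        }
    });

Values like "4g" work as well; in neo4j.properties the equivalent line is dbms.pagecache.memory=4g.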

The JVM heap size can be configured in neo4j-wrapper.conf.

Upvotes: 2
