Chris Beach
Chris Beach

Reputation: 4392

Neo4J "GC Monitor: Application threads blocked for 10000ms"

Note - there's a similar issue to this elsewhere on Stackoverflow, but it was related to Spring Data, and I'm not using Spring Data.

I have a simple social graph built from Twitter data. ~120,000 nodes and ~200,000 relationships so far.

The performance of Neo4J seems to be a bit hit-and-miss, with queries like this occasionally taking 200 secs:

MATCH p=(:User {twId: 838853137247141888})-[:FOLLOWS*0..3]->(:User {twId: 40002648}) 
RETURN SUM(REDUCE(s = 1.0, n IN NODES(p)[0..-1] | s / SIZE((n)-->()))) AS connectedness

I've looked in logs/debug.log and note a regular stream of the following, even when there are no queries taking place on the graph:

2017-05-27 18:50:40.041+0000 WARN  [o.n.k.i.c.MonitorGc] GC Monitor: Application threads blocked for 2436ms.
2017-05-27 18:50:46.831+0000 WARN  [o.n.k.i.c.MonitorGc] GC Monitor: Application threads blocked for 5705ms.
2017-05-27 18:50:55.631+0000 WARN  [o.n.k.i.c.MonitorGc] GC Monitor: Application threads blocked for 8699ms.
2017-05-27 18:50:56.450+0000 WARN  [o.n.k.i.c.MonitorGc] GC Monitor: Application threads blocked for 719ms.

My .neo4j-community.vmoptions contains only the following

-Xmx6G

(I tried a large heap to see if this would solve the problem - it didn't)

I'm running Neo4J Community v3.1.3 on MacOS Sierra 10.12.4

To be honest I'm not sure where to start when it comes to profiling Neo or working out what the server is up to, and the documentation hasn't been very helpful with my particular issue.

Tips much appreciated.

Update:

I'm also seeing the following in my debug.log on startup

2017-05-27 19:23:06.439+0000 ERROR [o.n.k.a.i.s.LuceneSchemaIndexProvider] Failed to open index:3, requesting re-population. Lock held by this virtual machine: /Users/chris/social-graph/schema/index/lucene/3/1/write.lock
org.apache.lucene.store.LockObtainFailedException: Lock held by this virtual machine: /Users/chris/social-graph/schema/index/lucene/3/1/write.lock
        at     org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:127)
        at     org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41)
        at     org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45)

Update:

Indices

Query Plan

Showing all JVM args: JVisualVM Overview

Oddly the VisualVM output doesn't tally with the reported 10sec GCs in debug.log JVisualVM output

Upvotes: 2

Views: 1492

Answers (2)

InverseFalcon
InverseFalcon

Reputation: 30417

As far as tuning the query itself, you may want to use index hints to force the plan to match on both nodes first before running the expansion, instead of the default behavior of expanding from one node then filtering the end node.

See how this works for you, both in terms of speed and PROFILE:

MATCH (start:User {twId: 838853137247141888}), (end:User {twId: 40002648}) 
USING INDEX start:User(twId)
USING INDEX end:User(twId)
MATCH p=(start)-[:FOLLOWS*0..3]->(end)
RETURN SUM(REDUCE(s = 1.0, n IN NODES(p)[0..-1] | s / SIZE((n)-->()))) AS connectedness

Note that this may not work in Neo4j 3.2, I think they've removed the RULE planner, which is needed to take advantage of the index hints.

EDIT

There IS a way around the above 3.2 restriction...it won't be as performant as the above query on 3.1.x (according to profiled db hits), but it should be more performant than the original query.

MATCH (start:User {twId: 838853137247141888}), (end:User {twId: 40002648}) 
MATCH p=(start)-[:FOLLOWS*0..3]->(x)
WHERE x = end
RETURN SUM(REDUCE(s = 1.0, n IN NODES(p)[0..-1] | s / SIZE((n)-->()))) AS connectedness

Upvotes: 1

Jerome_B
Jerome_B

Reputation: 1097

Tips: jvisualvm will give you a lot of info on your running jvms, like which GC is used (in the overview tab, maybe you'll have-XX:+UseG1GC).

Upvotes: 1

Related Questions