Reputation: 2960
We've been having a number of problems with our Solr search engine in our test environments. We have a SolrCloud setup on version 4.6: single shard, 4 nodes. The sequence we see is this: CPU flatlines at 100% on the leader node for several hours, then the server starts to throw OutOfMemory errors, 'PERFORMANCE WARNING: Overlapping onDeckSearchers' starts appearing in the logs, the leaders enter recovery mode, the filter cache and query cache warmup times hit around 60 seconds (normally less than 2 seconds), the leader node goes down, and we suffer an outage of the whole cluster for a few minutes while it recovers and elects a new leader. We think we're hitting a number of Solr bugs in 4.6 and the 4.x branch, so we are looking to move to 5.3.
We also recently dropped our soft commit interval from 10 minutes to 2 minutes. I am seeing regular CPU spikes every 2 minutes on all nodes, but the spikes are low, from 20-50% (max 100). When the CPU is maxed out I obviously can't see those spikes. Hard commits are every 15 seconds, with openSearcher set to false. We have a heavy query load and a heavy indexing load.
I am wondering whether the frequent soft commits are contributing significantly to this issue, or whether the long autowarm times on the caches are caused by the other problems we are experiencing (that is, are they a cause or a symptom)? We recently increased the indexing load on the server, and we need to address these issues in the test environment before we can promote to production.
Cache settings:
<filterCache class="solr.FastLRUCache"
             size="5000"
             initialSize="5000"
             autowarmCount="1000"/>
<queryResultCache class="solr.LRUCache"
                  size="20000"
                  initialSize="20000"
                  autowarmCount="5000"/>
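For reference, the commit settings described above correspond roughly to the following solrconfig.xml fragment. This is a sketch reconstructed from the intervals stated in this post, assuming the stock updateHandler layout; the times are in milliseconds.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit every 15 seconds: flushes to disk but does not open a new searcher -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit every 2 minutes (previously 10 minutes, i.e. 600000):
       each one opens a new searcher and triggers cache autowarming -->
  <autoSoftCommit>
    <maxTime>120000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

With these values every soft commit has to finish warming within 2 minutes, otherwise the next searcher starts warming while the previous one is still on deck.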
Upvotes: 3
Views: 11354
Reputation: 1183
We had this problem with Solr 4.10 (and, very rarely, 5.1). In our case, we were indexing quite frequently and commits were starting to become too close together. Sometimes our optimize command would run a bit longer than expected.
We solved it by making sure no indexing or commits occurred for at least ten minutes after the optimize operation started. We also autowarmed fewer entries for our caches. The following link will probably be useful to you if you haven't found it already:
Overlapping onDeckSearchers--Solr mailing list
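As a rough illustration of autowarming fewer entries, a cache section along these lines could be a starting point. The cache sizes are the ones from the question; the autowarmCount values are illustrative assumptions, not the numbers we actually used, and would need tuning against the warmup times reported in the Solr admin stats.

```xml
<!-- Assumed values: autowarmCount lowered from 1000 to 128 -->
<filterCache class="solr.FastLRUCache"
             size="5000"
             initialSize="5000"
             autowarmCount="128"/>
<!-- Assumed values: autowarmCount lowered from 5000 to 256 -->
<queryResultCache class="solr.LRUCache"
                  size="20000"
                  initialSize="20000"
                  autowarmCount="256"/>
```

The idea is that each new searcher re-executes at most autowarmCount cached entries before it goes live, so a smaller count should shorten warmup and make it less likely that a second searcher starts warming before the first has finished.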
Upvotes: 5