Nayan Soni

Reputation: 21

How to load entire Solr index into memory to increase performance?

My website gets 10 to 30 hits per second (including bot crawls). I have indexed 6 million records (from a MySQL table) in Solr. When I retrieve 30 records using q=something and sort=random_, Solr takes 200 to 300 milliseconds to respond, sometimes 100 ms.
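For reference, sort=random_* is usually backed by Solr's RandomSortField; a minimal sketch of the schema definitions this relies on (names assumed from the random_ prefix, check your managed-schema):

    <!-- managed-schema: a pseudo-field whose sort order is derived from a
         hash seeded by the field name, so e.g. random_1234 yields a
         repeatable "random" ordering per seed -->
    <fieldType name="random" class="solr.RandomSortField" indexed="true"/>
    <dynamicField name="random_*" type="random"/>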

I tried to improve retrieval speed with the solr.RAMDirectoryFactory setting, but I got an out-of-memory error. I know the solr.RAMDirectoryFactory setting is not persistent. So what is the best option to increase caching and load the whole index into memory?
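For context, RAMDirectoryFactory is enabled in solrconfig.xml roughly like this (a sketch of the setting I tried):

    <!-- solrconfig.xml: keeps the whole index on the JVM heap. Not
         persistent across restarts, and the index must fit inside -Xmx,
         which is why a small heap throws OutOfMemoryError -->
    <directoryFactory name="DirectoryFactory" class="solr.RAMDirectoryFactory"/>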

I am using an 8GB DigitalOcean server for Solr.

Solr settings:

    <filterCache class="solr.FastLRUCache"
                 size="512"
                 initialSize="512"
                 autowarmCount="0"/>

    <queryResultCache class="solr.LRUCache"
                      size="512"
                      initialSize="512"
                      autowarmCount="0"/>

    <documentCache class="solr.LRUCache"
                   size="512"
                   initialSize="512"
                   autowarmCount="0"/>

Solr version:

solr-spec 7.2.1
solr-impl 7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01-10 00:54:21
lucene-spec 7.2.1
lucene-impl 7.2.1 b2b6438b37073bee1fca40374e85bf91aa457c0b - ubuntu - 2018-01

Arguments:

    -DSTOP.KEY=solrrocks
    -DSTOP.PORT=7983
    -Djetty.home=/opt/solr/server
    -Djetty.port=8983
    -Dlog4j.configuration=file:/var/solr/log4j.properties
    -Dsolr.data.home=
    -Dsolr.default.confdir=/opt/solr/server/solr/configsets/_default/conf
    -Dsolr.install.dir=/opt/solr
    -Dsolr.jetty.https.port=8983
    -Dsolr.log.dir=/var/solr/logs
    -Dsolr.log.muteconsole
    -Dsolr.solr.home=/var/solr/data
    -Duser.timezone=UTC
    -XX:+CMSParallelRemarkEnabled
    -XX:+CMSScavengeBeforeRemark
    -XX:+ParallelRefProcEnabled
    -XX:+PrintGCApplicationStoppedTime
    -XX:+PrintGCDateStamps
    -XX:+PrintGCDetails
    -XX:+PrintGCTimeStamps
    -XX:+PrintHeapAtGC
    -XX:+PrintTenuringDistribution
    -XX:+UseCMSInitiatingOccupancyOnly
    -XX:+UseConcMarkSweepGC
    -XX:+UseGCLogFileRotation
    -XX:+UseParNewGC
    -XX:-OmitStackTraceInFastThrow
    -XX:CMSInitiatingOccupancyFraction=50
    -XX:CMSMaxAbortablePrecleanTime=6000
    -XX:ConcGCThreads=4
    -XX:GCLogFileSize=20M
    -XX:MaxTenuringThreshold=8
    -XX:NewRatio=3
    -XX:NumberOfGCLogFiles=9
    -XX:OnOutOfMemoryError=/opt/solr/bin/oom_solr.sh 8983 /var/solr/logs
    -XX:ParallelGCThreads=4
    -XX:PretenureSizeThreshold=64m
    -XX:SurvivorRatio=4
    -XX:TargetSurvivorRatio=90
    -Xloggc:/var/solr/logs/solr_gc.log
    -Xms512m
    -Xmx512m
    -Xss256k
    -verbose:gc

Thanks in advance

Upvotes: 2

Views: 1578

Answers (1)

kellyfj

Reputation: 6943

It's important to remember that with an 8GB server and the Solr heap set to 512M, Lucene (not Solr!) will use the rest of the available memory on the machine through the OS page cache (minus whatever the OS itself needs, etc.).

So let's say, for example, that the OS needs 512M of RAM and your Solr heap is 512M; that leaves roughly 7GB for Lucene. If you are new to Solr and Lucene, this is a great read on how Lucene's memory works.
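This is also why the stock directory factory is usually the right choice: it memory-maps index files so the OS page cache, not the JVM heap, holds the hot parts. A sketch of the Solr 7 default in solrconfig.xml:

    <!-- solrconfig.xml default: NRTCachingDirectoryFactory delegates to an
         MMapDirectory on 64-bit JVMs, so index files are memory-mapped and
         cached by the operating system rather than the heap -->
    <directoryFactory name="DirectoryFactory"
                      class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>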

How big are your indexes? You can check your /solr/data folders with du -h.

To clarify: INCREASING the Solr heap will make the situation worse (there will be less memory left for Lucene). To avoid RAM being swapped to disk, you should also turn off swap (for example with swapoff -a, or by setting vm.swappiness to a low value).

There are lots of knobs and buttons in Solr, Lucene, and your instance that need to be tweaked to help ensure your entire index stays in memory. Even then, remember that things like Java GC, CPU speed, memory speed, and whether the index has been pre-warmed into memory will significantly affect response time.
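Pre-warming is typically done with Solr's searcher warm-up hooks plus cache autowarming; a minimal sketch for solrconfig.xml (the warming query is an assumption, substitute something representative of your real traffic):

    <!-- run a representative query whenever a new searcher opens, pulling
         the relevant index files into the OS cache before users hit it -->
    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst><str name="q">something</str><str name="rows">30</str></lst>
      </arr>
    </listener>

    <!-- and carry hot entries over from the old searcher after commits -->
    <filterCache class="solr.FastLRUCache" size="512"
                 initialSize="512" autowarmCount="128"/>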


Upvotes: 1
