Reputation: 9
I have a Master-Repeater-Slave configuration. The master, repeaters, and slaves are all set up with <str name="replicateAfter">optimize</str> in the replication config. The full config is below:
<requestHandler name="/replication" class="solr.ReplicationHandler">
<str name="commitReserveDuration">01:00:00</str>
<lst name="master">
<str name="enable">${Project.enable.master:false}</str>
<str name="replicateAfter">${Project.master.setReplicateAfterCommit:}</str>
<str name="replicateAfter">${Project.master.setReplicateAfterStartup:}</str>
<str name="replicateAfter">optimize</str>
<str name="confFiles"></str>
</lst>
<lst name="slave">
<str name="enable">${Project.enable.slave:false}</str>
<str name="masterUrl">/solr/someCoreName</str>
<str name="pollInterval">${Newton.replication.pollInterval:00:02:00}</str>
</lst>
</requestHandler>
Repeaters are configured to poll every 1 second. The N slaves are configured to poll at different intervals (e.g. 2, 4, 6, 8 minutes) so as not to overwhelm the repeater with download requests. Both are set via Java startup command args.
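As a rough illustration of the staggered intervals described above, the following Python sketch builds one JVM argument per slave. It assumes the property name `Newton.replication.pollInterval` from the config and the `HH:mm:ss` format of its default value; the helper names and the 2-minute step are hypothetical.

```python
def hhmmss(seconds):
    """Format a number of seconds as HH:mm:ss, the format used by pollInterval."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def staggered_poll_args(n_slaves, step_minutes=2,
                        prop="Newton.replication.pollInterval"):
    """Return one -D startup argument per slave: 2, 4, 6, ... minutes.

    The property name is taken from the config in the question; the step
    size is an assumption chosen to match the 2/4/6/8 minute example.
    """
    return [
        f"-D{prop}={hhmmss((i + 1) * step_minutes * 60)}"
        for i in range(n_slaves)
    ]


if __name__ == "__main__":
    for arg in staggered_poll_args(4):
        print(arg)
```

Each printed argument would be appended to the startup command of one slave; the repeater's 1-second interval corresponds to `hhmmss(1)`.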
Given that I issue an optimize on the master index every 2 hours, I expect the master to make a replicable version available only after the optimize. Instead, the master's generation increases after every commit, which happens every X (configurable) minutes, and the repeaters and slaves get the unoptimized index (the recent state with the latest committed data).
<updateHandler class="solr.DirectUpdateHandler2">
<updateLog>
<str name="dir">some/dir</str>
</updateLog>
<autoCommit>
<maxDocs>10000000</maxDocs>
<maxTime>${Project.autoCommit.maxTime:60000}</maxTime>
<openSearcher>false</openSearcher>
</autoCommit>
</updateHandler>
Repeater/slave logs after they see the master's generation increment:
2019-08-31 14:48:05,544 [INFO ][indexFetcher-15-thread-1][solr.handler.IndexFetcher][fetchLatestIndex()] - Master's generation: 6
2019-08-31 14:48:05,544 [INFO ][indexFetcher-15-thread-1][solr.handler.IndexFetcher][fetchLatestIndex()] - Master's version: 1567288083960
2019-08-31 14:48:05,544 [INFO ][indexFetcher-15-thread-1][solr.handler.IndexFetcher][fetchLatestIndex()] - Slave's generation: 5
2019-08-31 14:48:05,544 [INFO ][indexFetcher-15-thread-1][solr.handler.IndexFetcher][fetchLatestIndex()] - Slave's version: 1567288023785
2019-08-31 14:48:05,544 [INFO ][indexFetcher-15-thread-1][solr.handler.IndexFetcher][fetchLatestIndex()] - Starting replication process
2019-08-31 14:48:05,563 [INFO ][indexFetcher-15-thread-1][solr.handler.IndexFetcher][fetchLatestIndex()] - Number of files in latest index in master: 66
2019-08-31 14:48:05,624 [INFO ][indexFetcher-15-thread-1][solr.update.DefaultSolrCoreState][changeWriter()] - New IndexWriter is ready to be used.
2019-08-31 14:48:05,627 [INFO ][indexFetcher-15-thread-1][solr.handler.IndexFetcher][fetchLatestIndex()] - Starting download (fullCopy=false) to MMapDirectory@/data/<path>/index.20190831144805564 lockFactory=org.apache.lucene.store.NativeFSLockFactory@416c5340
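To check how often fetches like the one above are triggered by intermediate commits, the log lines can be scanned for the generation and version values. The sketch below is only a log-reading aid: the regular expression is based on the message text shown in this excerpt, and the "behind" check is a simplification, not a reproduction of Solr's own decision logic.

```python
import re

# Matches e.g. "Master's generation: 6" or "Slave's version: 1567288023785".
STATE_LINE = re.compile(r"(Master|Slave)'s (generation|version): (\d+)")


def parse_fetch_state(log_text):
    """Collect the master/slave generation and version reported in a log excerpt."""
    state = {}
    for who, field, value in STATE_LINE.findall(log_text):
        state[(who.lower(), field)] = int(value)
    return state


def slave_is_behind(state):
    """Simplified check: the slave reports an older generation than the master."""
    return state[("slave", "generation")] < state[("master", "generation")]


SAMPLE = """\
[fetchLatestIndex()] - Master's generation: 6
[fetchLatestIndex()] - Master's version: 1567288083960
[fetchLatestIndex()] - Slave's generation: 5
[fetchLatestIndex()] - Slave's version: 1567288023785
"""

if __name__ == "__main__":
    state = parse_fetch_state(SAMPLE)
    print(state, slave_is_behind(state))
```

In the excerpt the two versions are about 60 seconds apart (1567288083960 vs 1567288023785 in milliseconds), which matches the default autoCommit maxTime of 60000 rather than the 2-hour optimize cycle.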
How do I make absolutely sure that the index flows from the master to the repeaters/slaves only after the optimize command I issued has completed?
Once I issue the optimize, the optimized index with 1 segment does flow to the repeaters/slaves as expected. However, the intermediate commits on the master also cause the repeaters/slaves to download part of the new index, which pushes their segment count above 1 and slows down search, since searching more than one segment costs more than searching a single segment. I want the new index to replicate only after the periodic (programmed in code) optimize command is issued, not after every commit. I tried removing the commit duration on the master, and it then incremented its generation only after optimize. But if I remove commits altogether, we risk losing the uncommitted data if the machine dies between two optimize cycles.
Solr version: 7.7.1
I also tried adding mergePolicyFactory configs, but the behaviour is still the same:
<mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
<int name="maxMergeAtOnce">32</int>
<int name="segmentsPerTier">32</int>
</mergePolicyFactory>
Upvotes: 0
Views: 849
Reputation: 608
Try changing <str name="replicateAfter">commit</str>
to <str name="replicateAfter">optimize</str>
Also, if that does not work, try removing the polling interval configuration from the slaves.
What you are seeing is expected behaviour for Solr; nothing here is unusual. Try out the changes and I hope it works.
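If polling is removed from the slaves, something else has to trigger each fetch. A minimal sketch of that idea follows, assuming the replication handler's HTTP commands `disablepoll` and `fetchindex` and an optimize issued through the `/update` handler. The host names, the function names, and the ordering are hypothetical, and this has not been verified against 7.7.1.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def replication_url(core_url, command):
    """Build a URL for a ReplicationHandler command on the given core."""
    query = urlencode({"command": command, "wt": "json"})
    return f"{core_url.rstrip('/')}/replication?{query}"


def optimize_url(core_url):
    """Build a URL that asks the update handler of the core to optimize."""
    query = urlencode({"optimize": "true", "waitSearcher": "true", "wt": "json"})
    return f"{core_url.rstrip('/')}/update?{query}"


def call(url, timeout=3600):
    """Issue a GET request and return the response body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def optimize_then_pull(master_url, follower_urls):
    """Optimize the master, then ask each follower to fetch once.

    Assumption: polling was already switched off on every follower (for
    example with the 'disablepoll' command), so this is the only trigger.
    Caveat: 'fetchindex' may return before the download finishes, so a
    repeater should be given time to complete before its slaves are asked.
    """
    call(optimize_url(master_url))
    for url in follower_urls:
        call(replication_url(url, "fetchindex"))


if __name__ == "__main__":
    # Hypothetical hosts; the core name is taken from the question's masterUrl.
    print(optimize_url("http://master:8983/solr/someCoreName"))
    print(replication_url("http://slave1:8983/solr/someCoreName", "disablepoll"))
    print(replication_url("http://slave1:8983/solr/someCoreName", "fetchindex"))
```

This only narrows the window: if an autoCommit lands on the master between the optimize and a fetch, the follower may still pull a newer, unoptimized commit.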
Upvotes: 0