Reputation: 1664
I am in the process of a bulk indexing operation into a Solr 5.0 collection that now holds approximately 200 million documents. I am noticing that the tlog keeps growing and is not being deleted; additionally, indexing performance has become very slow. I am wondering why the tlog is not being removed. This is what the data directory looks like:
du -sh *
4.0K data
69G index
109G tlog
I've tried multiple variations of:
update?commit=true&expungeDeletes=true&openSearcher=true
I can see in the log file that Solr receives the request, but nothing changes: the tlog is not removed.
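For reference, the commit request above can be built and sent from a script. This is only a minimal sketch: the base URL `http://localhost:8983/solr` and the collection name `mycollection` are placeholders, not values from this setup, and `build_commit_url` and `send_commit` are hypothetical helper names. The only Solr-specific parts are the `commit`, `openSearcher` and `expungeDeletes` parameters of the update handler.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_commit_url(base_url, collection, open_searcher=True, expunge_deletes=False):
    """Build an explicit hard-commit URL for a Solr update handler.

    base_url and collection are assumptions (placeholders), not values
    taken from the original post.
    """
    params = {"commit": "true", "openSearcher": str(open_searcher).lower()}
    if expunge_deletes:
        params["expungeDeletes"] = "true"
    return f"{base_url.rstrip('/')}/{collection}/update?{urlencode(params)}"


def send_commit(url, timeout=600):
    """Send the commit request and return the HTTP status code."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Only prints the URL; call send_commit(url) against a running instance.
    print(build_commit_url("http://localhost:8983/solr", "mycollection"))
```

Keeping the URL construction separate from the network call makes it easy to check the exact parameters being sent before pointing the script at a live instance.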
The commit settings in solrconfig are:
<autoCommit>
  <maxTime>15000</maxTime>
  <maxDocs>1500000</maxDocs>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>900000</maxTime>
  <maxDocs>2000000</maxDocs>
</autoSoftCommit>
One thing to keep in mind is that I had soft commit commented out during the indexing process. Also, these values are fairly high because this is a relatively index-heavy collection with tightly controlled querying, so the commit strategy is quite relaxed.
I restarted Solr and, as expected, it is taking a very long time to start because it is replaying the tlog; I am not sure whether it will clear this up once it has fully started. I am under the impression that Solr keeps some tlogs around in case it needs to replicate the data to another collection, but this is a standalone instance, so that should not be necessary. Additionally, since the tlog is larger than the index folder, I am assuming there are items not yet committed to the main index. Is that right?
Any idea what's happening here?
Upvotes: 2
Views: 4108
Reputation: 1664
So I thought I'd pass along an update, even though it's a bit late.
I restarted the Solr instance; as expected, it took about 4 hours to start up because the tlogs had to be replayed. They were then purged after a commit.
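To confirm that the tlogs really shrink after a commit, the two directories can be compared from a script instead of running du by hand. This is a rough sketch: it assumes the data directory contains `index` and `tlog` subdirectories as in the listing in the question, and `dir_size_bytes` and `tlog_report` are hypothetical helper names. It sums apparent file sizes, so the numbers can differ slightly from what du reports.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (0 if path is missing)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def tlog_report(data_dir):
    """Compare the size of the tlog directory with the index directory."""
    index_bytes = dir_size_bytes(os.path.join(data_dir, "index"))
    tlog_bytes = dir_size_bytes(os.path.join(data_dir, "tlog"))
    return {
        "index_bytes": index_bytes,
        "tlog_bytes": tlog_bytes,
        "tlog_larger": tlog_bytes > index_bytes,
    }
```

Running `tlog_report` before and after a commit shows whether the tlog directory was actually purged.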
Upvotes: 3