Reputation: 6023
I am trying to index a heavy dataset (using DIH) in which one particular field is very large...
However, as soon as I start, I get memory warnings and rollbacks (OutOfMemoryError). I have learned that we can pass the -Xmx1024m
option to the java command that starts Solr to allocate more memory to the heap.
My question is: since this could also become insufficient later, is the issue related to caching?
Here is my cache block in solrconfig.xml:
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="0"/>

<queryResultCache class="solr.LRUCache"
                  size="512"
                  initialSize="512"
                  autowarmCount="0"/>

<documentCache class="solr.LRUCache"
               size="512"
               initialSize="512"
               autowarmCount="0"/>
I am thinking that maybe I need to turn off the documentCache. Does anyone have a better idea? Or perhaps there is another issue here?
Just to let you know: until I added that very heavy DB field for indexing, everything was fine...
Upvotes: 1
Views: 336
Reputation: 33341
It could be because of caching, sure. Hard to say without more information.
However, I would say no, you should not turn off document caching; please see the documentation on documentCache:

The size for the documentCache should always be greater than <max_results> * <max_concurrent_queries>, to ensure that Solr does not need to refetch a document during a request.
You might be able to scale your cache settings back somewhat, if necessary. Referring back to the documentation above, you could also take its advice about lazy field loading.
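As a rough sketch, both of those changes live in the query section of solrconfig.xml; the size of 256 below is just an illustrative value, not a recommendation, so tune it to your own query patterns:

<!-- Smaller documentCache; size here is only an illustrative value -->
<documentCache class="solr.LRUCache"
               size="256"
               initialSize="256"
               autowarmCount="0"/>

<!-- Load only the fields actually requested; other stored fields
     (such as your very large one) are fetched lazily on access -->
<enableLazyFieldLoading>true</enableLazyFieldLoading>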
A better approach might be to not store the huge field in the index at all. A very typical pattern is to index large fields, but store their contents entirely external to the index, and fetch them from whatever external data source you created when they are really needed.
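A sketch of what that could look like in schema.xml, assuming a hypothetical large field named big_content and the text_general field type from the example schema: the field stays searchable because it is indexed, but its contents never sit in the index (or the documentCache) because it is not stored:

<!-- Hypothetical large field: searchable, contents kept only in the external source -->
<field name="big_content" type="text_general" indexed="true" stored="false"/>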
It's also possible that 1GB of memory is simply not enough to support what you want to do with your Solr instance and the expanded dataset.
Upvotes: 1