Reputation: 544
I have an Elasticsearch 1.1.1 cluster with two nodes, each with a configured heap of 18G (each node has 32G of RAM). In total we have 6 shards and one replica for each shard. ES runs on a 64-bit JVM on an Ubuntu box.
There is only one index in our cluster. Cluster health looks green. The document count on each node is close to 200 million, and the data on each node is around 150GB. There are no unassigned shards.
The system is encountering OOM errors (java.lang.OutOfMemoryError: Java heap space).
Content of elasticsearch.yml:
bootstrap.mlockall: true
transport.tcp.compress: true
indices.fielddata.cache.size: 35%
indices.cache.filter.size: 30%
indices.cache.filter.terms.size: 1024mb
indices.memory.index_buffer_size: 25%
indices.fielddata.breaker.limit: 20%
threadpool:
  search:
    type: cached
    size: 100
    queue_size: 1000
It has been noticed that instances of org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector are occupying most of the heap space (around 45%).
I am new to ES. Could someone guide me (or comment) on this OOM issue? What could be the cause, given that we have a lot of heap space allocated?
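If it helps with diagnosis, this is a sketch of how fielddata usage can be checked via the stats API (localhost:9200 is just an assumed local node, and I am not certain the per-field breakdown is available on every 1.x release):

# total and per-field fielddata memory held on each node
curl -XGET 'http://localhost:9200/_nodes/stats/indices/fielddata?fields=*&pretty'
# the same breakdown aggregated at the index level
curl -XGET 'http://localhost:9200/_stats/fielddata?fields=*&pretty'

If a single field dominates this output, that is the field whose sorting or aggregations are loading fielddata onto the heap.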
Upvotes: 0
Views: 121
Reputation: 10859
To be blunt: You are flogging a dead horse. 1.x is not maintained any more and there are good reasons for that. In the case of OOM: Elasticsearch replaced field data wherever possible with doc values and added more circuit breakers.
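Even on 1.x you can opt in to doc values for not_analyzed and numeric fields. A minimal sketch of what that looks like on a new index, assuming a 1.x release recent enough to support the doc_values mapping parameter (my_index_v2, my_type and created_at are placeholders):

# create a new index whose sort field stores doc values on disk instead of fielddata on the heap
# (existing data has to be reindexed into it; the setting cannot be flipped on an existing field)
curl -XPUT 'http://localhost:9200/my_index_v2' -d '
{
  "mappings": {
    "my_type": {
      "properties": {
        "created_at": { "type": "date", "doc_values": true }
      }
    }
  }
}'

Those TopFieldCollector instances are created when sorting, so moving the sort fields to doc values should take most of that pressure off the heap.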
What further complicates the issue is that the official docs no longer cover 1.1, only 0.90, 1.3, 1.4, and so on. So at the very least you should upgrade to 1.7 (the latest 1.x release).
Turning to your OOM issue, here is what you could try:
indices.fielddata.breaker.limit looks fishy to me. I think this config parameter was renamed to indices.breaker.fielddata.limit in 1.4, and the Elasticsearch Guide states:
In Fielddata Size, we spoke about adding a limit to the size of fielddata, to ensure that old unused fielddata can be evicted. The relationship between indices.fielddata.cache.size and indices.breaker.fielddata.limit is an important one. If the circuit-breaker limit is lower than the cache size, no data will ever be evicted. In order for it to work properly, the circuit breaker limit must be higher than the cache size.
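In your config the breaker limit (20%) is below the fielddata cache size (35%), which is exactly the inverted relationship the guide warns about. A minimal sketch of a consistent pairing on 1.4+ (60% is simply the documented default for the breaker, not a tuned recommendation):

indices.fielddata.cache.size: 35%
# the circuit breaker must stay above the cache size so old fielddata can actually be evicted
indices.breaker.fielddata.limit: 60%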
Upvotes: 1