Reputation: 5376
What Happened:
When I query Elasticsearch repeatedly, it suddenly stops working.
I increased the heap size from 1 GB to 2 GB and then to 4 GB, but it didn't help.
Current heap usage is only about 20% of the allocated 4 GB, so why does ES still fail with an OutOfMemoryError?
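For reference, the heap is set via the standard -Xms/-Xmx options in config/jvm.options; currently:
# config/jvm.options (heap raised from 1 GB -> 2 GB -> 4 GB)
-Xms4g
-Xmx4g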
Elasticsearch Logs:
[2019-11-11T11:12:16,654][INFO ][o.e.c.r.a.AllocationService] [es-stg] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[.kibana_1][0]] ...])
[2019-11-11T11:12:51,447][INFO ][o.e.c.m.MetaDataIndexTemplateService] [es-stg] adding template [kibana_index_template:.kibana] for index patterns [.kibana]
[2019-11-11T11:13:10,527][INFO ][o.e.m.j.JvmGcMonitorService] [es-stg] [gc][71] overhead, spent [418ms] collecting in the last [1s]
[2019-11-11T11:13:16,619][INFO ][o.e.m.j.JvmGcMonitorService] [es-stg] [gc][77] overhead, spent [313ms] collecting in the last [1s]
[2019-11-11T11:13:21,187][WARN ][o.e.m.j.JvmGcMonitorService] [es-stg] [gc][80] overhead, spent [2.4s] collecting in the last [2.5s]
[2019-11-11T11:13:25,396][WARN ][o.e.m.j.JvmGcMonitorService] [es-stg] [gc][83] overhead, spent [2s] collecting in the last [2.1s]
[2019-11-11T11:13:27,983][WARN ][o.e.m.j.JvmGcMonitorService] [es-stg] [gc][84] overhead, spent [2.3s] collecting in the last [2.6s]
[2019-11-11T11:13:30,029][WARN ][o.e.m.j.JvmGcMonitorService] [es-stg] [gc][85] overhead, spent [2s] collecting in the last [2s]
[2019-11-11T11:13:34,184][WARN ][o.e.m.j.JvmGcMonitorService] [es-stg] [gc][86] overhead, spent [4.1s] collecting in the last [4.1s]
[2019-11-11T11:14:31,155][WARN ][o.e.c.InternalClusterInfoService] [es-stg] Failed to update node information for ClusterInfoUpdateJob within 15s timeout
[2019-11-11T11:14:31,172][WARN ][o.e.m.j.JvmGcMonitorService] [es-stg] [gc][87] overhead, spent [18.2s] collecting in the last [18.3s]
[2019-11-11T11:14:31,215][ERROR][o.e.x.m.c.i.IndexStatsCollector] [es-stg] collector [index-stats] timed out when collecting data
[2019-11-11T11:14:31,210][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [es-stg] fatal error in thread [elasticsearch[es-stg][search][T#6]], exiting
java.lang.OutOfMemoryError: Java heap space
Spec:
Ubuntu: 16.04
Memory: 8 GB
JVM memory: 4 GB
Results for:
http://localhost:9200/_cat/allocation
84 4.6gb 55.1gb 22.2gb 77.3gb 71 206.189.140.50 206.189.140.50 es-stg
42 UNASSIGNED
http://localhost:9200/_cat/fielddata?v
id host ip node field size
o_KWnYBuR-aimAl1VUtygA ip ip es-stg shard.node 2.7kb
o_KWnYBuR-aimAl1VUtygA ip ip es-stg transaction.type 704b
o_KWnYBuR-aimAl1VUtygA ip ip es-stg transaction.name.keyword 1kb
o_KWnYBuR-aimAl1VUtygA ip ip es-stg kibana_stats.kibana.status 2kb
o_KWnYBuR-aimAl1VUtygA ip ip es-stg beat.hostname 5.8kb
o_KWnYBuR-aimAl1VUtygA ip ip es-stg transaction.result 704b
o_KWnYBuR-aimAl1VUtygA ip ip es-stg kibana_stats.kibana.uuid 2kb
o_KWnYBuR-aimAl1VUtygA ip ip es-stg source_node.name 2.7kb
o_KWnYBuR-aimAl1VUtygA ip ip es-stg shard.index 12.1kb
o_KWnYBuR-aimAl1VUtygA ip ip es-stg shard.state 6.6kb
o_KWnYBuR-aimAl1VUtygA ip ip es-stg context.service.agent.name 2.2kb
o_KWnYBuR-aimAl1VUtygA ip ip es-stg source_node.uuid 2.7kb
http://localhost:9200/_cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
ip 18 98 2 0.04 0.08 0.06 mdi * es-stg
http://localhost:9200/_cluster/settings
{"persistent":{"xpack":{"monitoring":{"collection":{"enabled":"true"}}}},"transient":{}}
Expected:
I need Elasticsearch to work without failing.
(Could low disk space have anything to do with this problem?)
Upvotes: 3
Views: 9964
Reputation: 4461
A good rule of thumb is to keep the number of shards per node below 20 to 25 per GB of heap it has configured. Example: a node with a 30 GB heap should therefore have a maximum of 600-750 shards. Shards should be no larger than 50 GB; around 25 GB is what we target for large shards. Keep shard size below 40% of the data node's size. You can check your current allocation with:
curl localhost:9200/_cat/allocation?v
https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster
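Applying that rule of thumb to your node: a 4 GB heap should hold roughly 80-100 shards at most, but the _cat/allocation output above shows 84 assigned shards plus 42 unassigned, i.e. about 126 in total, so the node is over the guideline. A quick way to count every shard copy (primaries, replicas and unassigned) is:
curl -s 'localhost:9200/_cat/shards?h=index' | wc -l
Reducing the shard count (deleting old indices, shrinking them, or dropping unnecessary replicas) lowers the per-shard heap overhead.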
Add this line to config/elasticsearch.yml
bootstrap.memory_lock: true
https://www.elastic.co/guide/en/elasticsearch/reference/current/_memory_lock_check.html
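If Elasticsearch runs as the packaged systemd service, the OS must also allow memory locking or the memory lock bootstrap check will fail. A minimal sketch, assuming the default deb/rpm install:
sudo systemctl edit elasticsearch
# add the following to the override file, then restart the service:
[Service]
LimitMEMLOCK=infinity
Afterwards you can verify the lock took effect:
curl 'localhost:9200/_nodes?filter_path=**.mlockall'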
Upvotes: 2