Lasit Pant

Reputation: 317

Elasticsearch: "failed to get node info for {IP}" and "NoNodeAvailableException" in service log

I am facing an issue that I was not seeing earlier.

I am attaching logs from my service and from Elasticsearch (2.4.4):

2020-05-30 06:29:44.576  INFO 24787 --- [generic][T#287]] org.elasticsearch.client.transport       : [Shatter] failed to get node info for {#transport#-1}{172.17.0.1}{172.17.0.1:9300}, disconnecting...

org.elasticsearch.transport.ReceiveTimeoutTransportException: [][172.17.0.1:9300][cluster:monitor/nodes/liveness] request_id [10242] timed out after [5000ms]
        at org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:698) ~[elasticsearch-2.4.4.jar!/:2.4.4]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_242]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_242]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_242]

Elasticsearch logs:

[2020-05-30 06:29:46,784][INFO ][monitor.jvm              ] [Tempo] [gc][old][230125][41498] duration [8.2s], collections [1]/[9s], total [8.2s]/[10.7h], memory [473.2mb]->[426.1mb]/[494.9mb], all_pools {[young] [131.8mb]->[84.7mb]/[136.5mb]}{[survivor] [0b]->[0b]/[17mb]}{[old] [341.3mb]->[341.3mb]/[341.3mb]}
[2020-05-30 06:33:47,782][INFO ][monitor.jvm              ] [Tempo] [gc][old][230340][41540] duration [7s], collections [1]/[7.8s], total [7s]/[10.7h], memory [493.3mb]->[425mb]/[494.9mb], all_pools {[young] [136.5mb]->[83.6mb]/[136.5mb]}{[survivor] [15.4mb]->[0b]/[17mb]}{[old] [341.3mb]->[341.3mb]/[341.3mb]}
[2020-05-30 06:37:59,384][INFO ][monitor.jvm              ] [Tempo] [gc][old][230569][41582] duration [6.9s], collections [1]/[7.2s], total [6.9s]/[10.7h], memory [494.8mb]->[424.7mb]/[494.9mb], all_pools {[young] [136.5mb]->[83.4mb]/[136.5mb]}{[survivor] [16.9mb]->[0b]/[17mb]}{[old] [341.3mb]->[341.3mb]/[341.3mb]}

I am not facing the issue in my development environment; however, when I deploy on EC2 I get this error. In addition, when I restart Elasticsearch it works absolutely fine, but after 10-15 minutes or less, depending on the number of search or insertion queries, the error message appears.

Also, the storage space on the instance is 74% consumed (94G out of 120G). Can it be because of memory? I am pretty sure my res-client code is fine, as it has been working in production for a long time. Can it be a port issue? I am using a Docker container for Elasticsearch.

Any help will be appreciated.

_cat/fielddata?v

_cat/nodes?v

Upvotes: 1

Views: 943

Answers (1)

hamid bayat

Reputation: 2179

I think your heap size for Elasticsearch is very low. My best guess is that increasing the heap size will solve the problem. As for why this is happening now, I think it's because the volume of data has increased over time.
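The GC lines in the question seem to support this: each old-generation collection takes 7-8 s, which is longer than the 5000 ms liveness timeout in the client log, and the old pool is still at 341.3mb out of 341.3mb afterwards. A minimal Python sketch for pulling those numbers out of a `monitor.jvm` line follows; the function names and the 95% threshold are my own choices for illustration, not anything Elasticsearch provides.

```python
import re

# Pool entries look like "{[old] [341.3mb]->[341.3mb]/[341.3mb]}".
POOL_RE = re.compile(
    r"\{\[(\w+)\] \[([\d.]+)(b|kb|mb|gb)\]->"
    r"\[([\d.]+)(b|kb|mb|gb)\]/\[([\d.]+)(b|kb|mb|gb)\]\}"
)
DURATION_RE = re.compile(r"duration \[([\d.]+)(ms|s|m)\]")

BYTES = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}
SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0}

# First GC line from the question, copied verbatim.
SAMPLE_LINE = (
    "[2020-05-30 06:29:46,784][INFO ][monitor.jvm              ] [Tempo] "
    "[gc][old][230125][41498] duration [8.2s], collections [1]/[9s], "
    "total [8.2s]/[10.7h], memory [473.2mb]->[426.1mb]/[494.9mb], "
    "all_pools {[young] [131.8mb]->[84.7mb]/[136.5mb]}"
    "{[survivor] [0b]->[0b]/[17mb]}{[old] [341.3mb]->[341.3mb]/[341.3mb]}"
)


def parse_gc_line(line):
    """Extract the GC pause (seconds) and per-pool sizes (bytes) from one line."""
    match = DURATION_RE.search(line)
    if match is None:
        raise ValueError("not a monitor.jvm GC line")
    pools = {}
    for name, before, bu, after, au, limit, lu in POOL_RE.findall(line):
        pools[name] = {
            "before": float(before) * BYTES[bu],
            "after": float(after) * BYTES[au],
            "max": float(limit) * BYTES[lu],
        }
    return {"duration_s": float(match.group(1)) * SECONDS[match.group(2)],
            "pools": pools}


def old_gen_full(info, threshold=0.95):
    """True if the old pool is still above `threshold` of its max after the GC."""
    old = info["pools"]["old"]
    return old["after"] / old["max"] >= threshold


info = parse_gc_line(SAMPLE_LINE)
print(info["duration_s"], old_gen_full(info))
```

If the old pool stays full after every collection, the node has no headroom left. In Elasticsearch 2.x the heap is normally set with the `ES_HEAP_SIZE` environment variable, which can be passed to the container.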

My second guess is high load. It seems that you have had too many requests to Elasticsearch recently. You can check the size of the request queues via /_cat/thread_pool?v. There are two solutions for this situation: first, decrease the number of requests; second, add a node and add replicas.
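To make the queue check concrete, here is a rough Python sketch that reads the text returned by /_cat/thread_pool?v and reports any non-zero queue or rejected counters per node. The helper names, the `http://localhost:9200` address, and the sample table are assumptions for illustration only; the sample is made up and is not output from the asker's cluster.

```python
from urllib.request import urlopen


def fetch_thread_pool(base_url="http://localhost:9200"):
    """Fetch the _cat/thread_pool table as text (base_url is an assumed address)."""
    with urlopen(base_url + "/_cat/thread_pool?v", timeout=10) as resp:
        return resp.read().decode("utf-8")


def busy_thread_pools(cat_output):
    """Return {host: {column: count}} for non-zero *.queue and *.rejected columns."""
    rows = [ln.split() for ln in cat_output.strip().splitlines() if ln.strip()]
    header, data = rows[0], rows[1:]
    result = {}
    for row in data:
        record = dict(zip(header, row))
        hot = {
            col: int(val)
            for col, val in record.items()
            if col.endswith((".queue", ".rejected")) and val.isdigit() and int(val) > 0
        }
        if hot:
            result[record.get("host", "?")] = hot
    return result


# Illustrative table in the 2.x column layout; the numbers are invented.
SAMPLE = """
host       ip         bulk.active bulk.queue bulk.rejected index.active index.queue index.rejected search.active search.queue search.rejected
172.17.0.2 172.17.0.2 0           12         0             0            0           0              4             310          57
"""

print(busy_thread_pools(SAMPLE))
```

A growing `search.queue` or a non-zero `search.rejected` would point toward the load explanation, while empty queues would point back toward the heap.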

Upvotes: 1
