One ELK Node is Down

I have an ELK cluster consisting of three nodes: elk01, elk02, and elk03. One node, elk01, suddenly went down. When I checked its log at /var/log/elasticsearch/elasticsearch.log, I found these errors:

```
"[elk01] Authentication of [elastic] was terminated by realm [reserved] - failed to authenticate user [elastic]"
"[elk01] Authentication of [kibana_system] was terminated by realm [reserved] - failed to authenticate user [kibana_system]"
```

I tried the following troubleshooting steps:

I restarted the elasticsearch service on all nodes.
I can telnet to ports 9200 and 9300 on all nodes.
I tried to reset the elastic password on node elk01 using this command: /usr/share/elasticsearch/bin/elasticsearch-reset-password -i -u elastic

but I get the following error:

Error: Failed to determine the health of the cluster.
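
The telnet check on ports 9200 and 9300 can also be scripted. This is a minimal sketch, not an official tool: the function names `is_port_open` and `check_cluster_ports` are made up for illustration, and it assumes the host names elk01, elk02, and elk03 resolve from the machine running it. Like telnet, it only shows that a TCP connection can be opened; it says nothing about whether the node has joined the cluster.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and connects, like telnet does.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_cluster_ports(hosts, ports=(9200, 9300)):
    """Check every (host, port) pair and return a dict of results."""
    return {(host, port): is_port_open(host, port) for host in hosts for port in ports}
```

For example, `check_cluster_ports(["elk01", "elk02", "elk03"])` returns one True/False entry per node and port.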

I set xpack.security.enabled: false in elasticsearch.yml on all nodes, restarted elasticsearch, and ran the command above again, but I still can't reset the password.

From the elk02 and elk03 nodes, I can get the index status using curl http://elk02:9200/_cat/indices?v, and all indices are green.

Note: The cluster was working fine. This issue appeared suddenly, without any changes to the configuration.

Update: here is the content of /var/log/elasticsearch/elasticsearch.log on elk01:

[2023-09-14T23:10:09,174][INFO ][o.e.c.c.JoinHelper ] [elk01] failed to join {elk03}{lNN-V6I5S3mgyAOL1_taXg}{EM6lZBPcTpK4lqwdIq8udA}{elk03}{xx.xx.xx.224}{xx.xx.xx.224:9300}{cdfhilmrstw}{ml.allocated_processors_double=4.0, ml.machine_memory=16526131200, xpack.installed=true, ml.max_jvm_size=8262778880, ml.allocated_processors=4} with JoinRequest{sourceNode={elk01}{twbA5ovpSP-gAkeD65cmNg}{62VkkaWSR86H7-dlzTn1xg}{elk01}{xx.xx.xx.222}{xx.xx.xx.222:9300}{cdfhilmrstw}{ml.max_jvm_size=1073741824, ml.allocated_processors_double=8.0, xpack.installed=true, ml.machine_memory=16525099008, ml.allocated_processors=8}, minimumTerm=2609, optionalJoin=Optional[Join{term=2609, lastAcceptedTerm=10, lastAcceptedVersion=4237, sourceNode={elk01}{twbA5ovpSP-gAkeD65cmNg}{62VkkaWSR86H7-dlzTn1xg}{elk01}{xx.xx.xx.222}{xx.xx.xx.222:9300}{cdfhilmrstw}{ml.max_jvm_size=1073741824, ml.allocated_processors_double=8.0, xpack.installed=true, ml.machine_memory=16525099008, ml.allocated_processors=8}, targetNode={elk03}{lNN-V6I5S3mgyAOL1_taXg}{EM6lZBPcTpK4lqwdIq8udA}{elk03}{xx.xx.xx.224}{xx.xx.xx.224:9300}{cdfhilmrstw}{ml.allocated_processors_double=4.0, ml.machine_memory=16526131200, xpack.installed=true, ml.max_jvm_size=8262778880, ml.allocated_processors=4}}]}
org.elasticsearch.transport.RemoteTransportException: [elk03][xx.xx.xx.224:9300][internal:cluster/coordination/join]
Caused by: java.lang.IllegalStateException: index [.monitoring-es-7-2023.09.14/arNavb0JT92WnAajJMCyWQ] version not supported: 8.5.2 the node version is: 8.5.0
    at org.elasticsearch.cluster.coordination.JoinTaskExecutor.ensureIndexCompatibility(JoinTaskExecutor.java:268) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.cluster.coordination.JoinTaskExecutor.lambda$addBuiltInJoinValidators$9(JoinTaskExecutor.java:341) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.cluster.coordination.Coordinator.lambda$validateJoinRequest$13(Coordinator.java:663) ~[elasticsearch-8.5.0.jar:?]
    at java.util.ArrayList.forEach(ArrayList.java:1511) ~[?:?]
    at java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1092) ~[?:?]
    at org.elasticsearch.cluster.coordination.Coordinator.validateJoinRequest(Coordinator.java:663) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.cluster.coordination.Coordinator$1.onResponse(Coordinator.java:609) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.cluster.coordination.Coordinator$1.onResponse(Coordinator.java:604) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:31) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.transport.ClusterConnectionManager.lambda$connectToNodeOrRetry$1(ClusterConnectionManager.java:146) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.action.ActionListener$DelegatingFailureActionListener.onResponse(ActionListener.java:245) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.common.util.concurrent.ListenableFuture.notifyListenerDirectly(ListenableFuture.java:113) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.common.util.concurrent.ListenableFuture.done(ListenableFuture.java:100) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.common.util.concurrent.BaseFuture.set(BaseFuture.java:131) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.common.util.concurrent.ListenableFuture.onResponse(ListenableFuture.java:139) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.transport.ClusterConnectionManager.lambda$connectToNodeOrRetry$4(ClusterConnectionManager.java:253) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:162) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.action.ActionListener$RunAfterActionListener.onResponse(ActionListener.java:367) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.action.ActionListener$MappedActionListener.onResponse(ActionListener.java:127) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.transport.TransportService.lambda$handshake$6(TransportService.java:560) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.action.ActionListener$DelegatingFailureActionListener.onResponse(ActionListener.java:245) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.action.ActionListenerResponseHandler.handleResponse(ActionListenerResponseHandler.java:43) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1362) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1362) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.transport.InboundHandler.doHandleResponse(InboundHandler.java:369) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.transport.InboundHandler$2.doRun(InboundHandler.java:361) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:892) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.5.0.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
    at java.lang.Thread.run(Thread.java:1589) ~[?:?]
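
The long trace is hard to scan, so here is a minimal sketch for pulling the version details out of the `IllegalStateException` message in the log above. The regular expression is modeled only on the wording of that one message, and the names `JOIN_VERSION_ERROR`, `parse_version`, and `find_version_mismatch` are hypothetical helpers, not part of Elasticsearch.

```python
import re

# Pattern modeled on the message seen in the log above (an assumption:
# other Elasticsearch versions may word this message differently).
JOIN_VERSION_ERROR = re.compile(
    r"index \[(?P<index>[^/\]]+)/(?P<uuid>[^\]]+)\] "
    r"version not supported: (?P<index_version>\d+(?:\.\d+)*) "
    r"the node version is: (?P<node_version>\d+(?:\.\d+)*)"
)


def parse_version(text: str) -> tuple:
    """Turn '8.5.2' into (8, 5, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def find_version_mismatch(log_text: str):
    """Return the index name, UUID, and both versions from the first match, or None."""
    match = JOIN_VERSION_ERROR.search(log_text)
    if match is None:
        return None
    info = match.groupdict()
    info["index_is_newer"] = (
        parse_version(info["index_version"]) > parse_version(info["node_version"])
    )
    return info
```

Run against the log line above, this reports index `.monitoring-es-7-2023.09.14` with index version 8.5.2 and node version 8.5.0, so the index version is newer than the node version named in the message.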
