Reputation: 21
I have an ELK cluster consisting of three nodes: elk01, elk02, and elk03. One node, elk01, suddenly went down. When I checked its log, /var/log/elasticsearch/elasticsearch.log, I found these errors:
```
"[elk01] Authentication of [elastic] was terminated by realm [reserved] - failed to authenticate user [elastic]"
"[elk01] Authentication of [kibana_system] was terminated by realm [reserved] - failed to authenticate user [kibana_system]"
```
I did the following troubleshooting:
but I am getting the below error:
Error: Failed to determine the health of the cluster.
I set xpack.security.enabled: false in elasticsearch.yml on all nodes, restarted Elasticsearch, and ran the above commands again, but I still can't reset the password.
From the elk02 and elk03 nodes, I can get the index status using curl http://elk02:9200/_cat/indices?v and all indices have green status.
Note: The cluster was working fine. This issue appeared suddenly, without any changes to the configuration.
Updated with the content of /var/log/elasticsearch/elasticsearch.log for elk01:
[2023-09-14T23:10:09,174][INFO ][o.e.c.c.JoinHelper ] [elk01] failed to join {elk03}{lNN-V6I5S3mgyAOL1_taXg}{EM6lZBPcTpK4lqwdIq8udA}{elk03}{xx.xx.xx.224}{xx.xx.xx.224:9300}{cdfhilmrstw}{ml.allocated_processors_double=4.0, ml.machine_memory=16526131200, xpack.installed=true, ml.max_jvm_size=8262778880, ml.allocated_processors=4} with JoinRequest{sourceNode={elk01}{twbA5ovpSP-gAkeD65cmNg}{62VkkaWSR86H7-dlzTn1xg}{elk01}{xx.xx.xx.222}{xx.xx.xx.222:9300}{cdfhilmrstw}{ml.max_jvm_size=1073741824, ml.allocated_processors_double=8.0, xpack.installed=true, ml.machine_memory=16525099008, ml.allocated_processors=8}, minimumTerm=2609, optionalJoin=Optional[Join{term=2609, lastAcceptedTerm=10, lastAcceptedVersion=4237, sourceNode={elk01}{twbA5ovpSP-gAkeD65cmNg}{62VkkaWSR86H7-dlzTn1xg}{elk01}{xx.xx.xx.222}{xx.xx.xx.222:9300}{cdfhilmrstw}{ml.max_jvm_size=1073741824, ml.allocated_processors_double=8.0, xpack.installed=true, ml.machine_memory=16525099008, ml.allocated_processors=8}, targetNode={elk03}{lNN-V6I5S3mgyAOL1_taXg}{EM6lZBPcTpK4lqwdIq8udA}{elk03}{xx.xx.xx.224}{xx.xx.xx.224:9300}{cdfhilmrstw}{ml.allocated_processors_double=4.0, ml.machine_memory=16526131200, xpack.installed=true, ml.max_jvm_size=8262778880, ml.allocated_processors=4}}]}
org.elasticsearch.transport.RemoteTransportException: [elk03][xx.xx.xx.224:9300][internal:cluster/coordination/join]
Caused by: java.lang.IllegalStateException: index [.monitoring-es-7-2023.09.14/arNavb0JT92WnAajJMCyWQ] version not supported: 8.5.2 the node version is: 8.5.0
    at org.elasticsearch.cluster.coordination.JoinTaskExecutor.ensureIndexCompatibility(JoinTaskExecutor.java:268) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.cluster.coordination.JoinTaskExecutor.lambda$addBuiltInJoinValidators$9(JoinTaskExecutor.java:341) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.cluster.coordination.Coordinator.lambda$validateJoinRequest$13(Coordinator.java:663) ~[elasticsearch-8.5.0.jar:?]
    at java.util.ArrayList.forEach(ArrayList.java:1511) ~[?:?]
    at java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1092) ~[?:?]
    at org.elasticsearch.cluster.coordination.Coordinator.validateJoinRequest(Coordinator.java:663) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.cluster.coordination.Coordinator$1.onResponse(Coordinator.java:609) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.cluster.coordination.Coordinator$1.onResponse(Coordinator.java:604) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:31) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.transport.ClusterConnectionManager.lambda$connectToNodeOrRetry$1(ClusterConnectionManager.java:146) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.action.ActionListener$DelegatingFailureActionListener.onResponse(ActionListener.java:245) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.common.util.concurrent.ListenableFuture.notifyListenerDirectly(ListenableFuture.java:113) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.common.util.concurrent.ListenableFuture.done(ListenableFuture.java:100) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.common.util.concurrent.BaseFuture.set(BaseFuture.java:131) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.common.util.concurrent.ListenableFuture.onResponse(ListenableFuture.java:139) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.transport.ClusterConnectionManager.lambda$connectToNodeOrRetry$4(ClusterConnectionManager.java:253) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:162) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.action.ActionListener$RunAfterActionListener.onResponse(ActionListener.java:367) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.action.ActionListener$MappedActionListener.onResponse(ActionListener.java:127) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.transport.TransportService.lambda$handshake$6(TransportService.java:560) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.action.ActionListener$DelegatingFailureActionListener.onResponse(ActionListener.java:245) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.action.ActionListenerResponseHandler.handleResponse(ActionListenerResponseHandler.java:43) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1362) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1362) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.transport.InboundHandler.doHandleResponse(InboundHandler.java:369) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.transport.InboundHandler$2.doRun(InboundHandler.java:361) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:892) ~[elasticsearch-8.5.0.jar:?]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.5.0.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
    at java.lang.Thread.run(Thread.java:1589) ~[?:?]
Upvotes: 0
Views: 210
Reputation: 21
Yes, the problem was due to version incompatibility. I upgraded Elasticsearch on node elk01, and that solved the problem.
Upvotes: 0
Reputation: 3570
From the logs you shared, the node elk01 can't join the cluster because of an index:
index [.monitoring-es-7-2023.09.14/arNavb0JT92WnAajJMCyWQ] version not supported: 8.5.2 the node version is: 8.5.0
Try removing the .monitoring-es-7-2023.09.14 index.
Note: if you can't remove the index, you need to reindex.
Upvotes: 0
Reputation: 30163
If you read between the lines, so to speak, this log message says:
[elk01] failed to join {elk03}... Caused by: ... index [.monitoring-es-7-2023.09.14...] version not supported: 8.5.2 the node version is: 8.5.0
What happened was that somebody upgraded elk03 (and most likely elk02) to v8.5.2 while leaving elk01 behind at v8.5.0. The cluster kept working until elk03 created a new index today and you restarted elk01. That index was created by v8.5.2, which elk01's v8.5.0 does not support, so elk01 is not allowed to join the cluster.
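To make the reasoning concrete, here is a minimal Python sketch that pulls the two versions out of a message like the one quoted above and applies the rule as I read it: a node is rejected when an index was created by a newer version than the node runs. The function names (`parse_version`, `explain_join_failure`) are hypothetical, the regular expression is based only on the wording of the log line in the question, and this is an illustration rather than Elasticsearch's actual implementation.

```python
import re

# Pattern modelled on the "version not supported" message from the question.
# Assumption: other Elasticsearch versions may word this message differently.
MESSAGE_PATTERN = re.compile(
    r"index \[(?P<index>[^/\]]+)/[^\]]+\] version not supported: "
    r"(?P<index_version>\d+\.\d+\.\d+) the node version is: "
    r"(?P<node_version>\d+\.\d+\.\d+)"
)


def parse_version(text):
    """Turn '8.5.2' into (8, 5, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def explain_join_failure(message):
    """Extract index name and versions from a join-failure log message.

    Returns None if the message does not match the expected wording.
    """
    match = MESSAGE_PATTERN.search(message)
    if match is None:
        return None
    index_version = parse_version(match.group("index_version"))
    node_version = parse_version(match.group("node_version"))
    return {
        "index": match.group("index"),
        "index_version": index_version,
        "node_version": node_version,
        # Simplified rule: the joining node is older than the index's creator.
        "node_too_old": index_version > node_version,
    }
```

Feeding it the `Caused by:` line from the question should report the `.monitoring-es-7-2023.09.14` index with `node_too_old` set to true, which matches the diagnosis above.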
Solution: upgrade elk01 to v8.5.2 and ensure that all your nodes have the same version going forward.
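To catch a mixed-version cluster before it bites, something like the following sketch could help. It groups nodes by the version reported by `GET /_cat/nodes?h=name,version&format=json`. The helper names (`fetch_node_rows`, `group_nodes_by_version`) are hypothetical, and the sketch assumes the HTTP endpoint is reachable without authentication, as with the curl call in the question; with security enabled you would need to add credentials. Note that a node that failed to join (like elk01 here) will not be listed by the other nodes, so it has to be checked separately.

```python
import json
import urllib.request
from collections import defaultdict


def fetch_node_rows(base_url):
    """Fetch name/version rows from the cat nodes API as a list of dicts.

    Assumption: base_url (e.g. "http://elk02:9200") needs no authentication.
    """
    url = base_url.rstrip("/") + "/_cat/nodes?h=name,version&format=json"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def group_nodes_by_version(rows):
    """Map each Elasticsearch version to the sorted names of nodes running it."""
    by_version = defaultdict(list)
    for row in rows:
        by_version[row["version"]].append(row["name"])
    return {version: sorted(names) for version, names in by_version.items()}
```

If `group_nodes_by_version(fetch_node_rows("http://elk02:9200"))` returns more than one key, the cluster is running mixed versions and the lagging nodes should be upgraded.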
Upvotes: 0