Reputation: 4908
We've been running a production system on a single node for over a year and decided to add a second node for some resiliency.
I can telnet between machines without issue.
I can issue curl commands between machines.
We upgraded production from 6.8 default to 7.7.1 OSS, and created a brand new node also running 7.7.1 OSS.
master:
curl -XGET 'http://localhost:9200/?pretty'
{
"name" : "elasticsearch-01",
"cluster_name" : "zm-amz-data",
"cluster_uuid" : "EzB5di4pQzm7whY4fkpkbQ",
"version" : {
"number" : "7.7.1",
"build_flavor" : "oss",
"build_type" : "deb",
"build_hash" : "ad56dce891c901a492bb1ee393f12dfff473a423",
"build_date" : "2020-05-28T16:30:01.040088Z",
"build_snapshot" : false,
"lucene_version" : "8.5.1",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
New Node:
curl -XGET 'http://localhost:9200/?pretty'
{
"name" : "elasticsearch-02",
"cluster_name" : "zm-amz-data",
"cluster_uuid" : "EzB5di4pQzm7whY4fkpkbQ",
"version" : {
"number" : "7.7.1",
"build_flavor" : "oss",
"build_type" : "deb",
"build_hash" : "ad56dce891c901a492bb1ee393f12dfff473a423",
"build_date" : "2020-05-28T16:30:01.040088Z",
"build_snapshot" : false,
"lucene_version" : "8.5.1",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
Still no luck. After 3 days I'm getting close to throwing ES out. The new node sees the master and joins the cluster, but no data is replicating and the cluster status is red.
curl -XGET 'http://localhost:9200/_cluster/health?pretty'
{
"cluster_name" : "zm-amz-data",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 2,
"number_of_data_nodes" : 2,
"active_primary_shards" : 29,
"active_shards" : 29,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 39,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 42.64705882352941
}
There are no errors on the new node or the master. This is the end of the master log.
[2020-06-07T00:07:00,644][INFO ][o.e.c.s.MasterService ] [elasticsearch-01] node-join[{elasticsearch-02}{e_toEmodToGU98qY6MZaWQ}{kiBIItOtRpql0GcThMtkHg}{<new node ip>}{<new node ip>:9300}{dimr} join existing leader], term: 3, version: 154, delta: added {{elasticsearch-02}{e_toEmodToGU98qY6MZaWQ}{kiBIItOtRpql0GcThMtkHg}{<new node ip>}{<new node ip>:9300}{dimr}}
[2020-06-07T00:07:01,131][INFO ][o.e.c.s.ClusterApplierService] [elasticsearch-01] added {{elasticsearch-02}{e_toEmodToGU98qY6MZaWQ}{kiBIItOtRpql0GcThMtkHg}{<new node ip>}{<new node ip>:9300}{dimr}}, term: 3, version: 154, reason: Publication{term=3, version=154}
All I want is to get all our data from the master onto the new node!
As requested:
curl -XGET 'localhost:9200/_cat/nodes?pretty'
<master node IP> 25 25 0 0.29 0.33 0.24 dimr * elasticsearch-01
<new node IP> 24 31 0 0.01 0.00 0.00 dimr - elasticsearch-02
curl -XGET 'localhost:9200/_cat/allocation?pretty'
29 86.1gb 276.7gb 963.5gb 1.2tb 22 <master node IP> <master node IP> elasticsearch-01
0 0b 3.7gb 151.1gb 154.8gb 2 <new node IP> <new node IP> elasticsearch-02
39 UNASSIGNED
curl -XGET 'localhost:9200/_cluster/settings?pretty'
{
"persistent" : {
"archived" : {
"xpack" : {
"monitoring" : {
"collection" : {
"enabled" : "true"
}
}
}
},
"cluster" : {
"routing" : {
"allocation" : {
"enable" : "primaries"
}
}
}
},
"transient" : { }
}
curl -XGET 'localhost:9200/_cat/indices?pretty'
yellow open app_indices_one X73QG8FeR3qbfwvUTgZM4w 5 1 141039497 6841171 85.4gb 85.4gb
yellow open .monitoring-kibana-6-2020.05.31 weQn3afBQ3yQ_gWbB1ZeBA 1 1 8639 0 1.9mb 1.9mb
red open .apm-custom-link Cm4oM-fJRs6o8RH275pshQ 1 1
yellow open .monitoring-kibana-6-2020.06.04 dEsNwfodSQCy4a5FsfVzbQ 1 1 8610 0 1.9mb 1.9mb
red open .kibana_task_manager_2 IsjOQqoWTxSytilnTtAHLw 1 1
yellow open .monitoring-es-6-2020.06.05 vjZgiH6wTmS8nM9uhZ2Z6g 1 1 208537 507 111.8mb 111.8mb
yellow open .monitoring-es-6-2020.06.06 qV2J_qtnQoGFg6C8R-mIOA 1 1 32582 273 18.4mb 18.4mb
yellow open .monitoring-kibana-6-2020.06.03 qQWZQ8XoRxS9rhNDbN-THQ 1 1 8577 0 2.1mb 2.1mb
red open .kibana_task_manager_1 iAwrQJnKSm2N_VU1Q-LgQA 1 1
yellow open .monitoring-kibana-6-2020.06.02 xWal0unTS2qsYxPnOmmhPw 1 1 8639 0 1.9mb 1.9mb
yellow open .monitoring-es-6-2020.06.03 EpT1Ex5eQiKwQNXgmcdpCQ 1 1 206852 312 111.2mb 111.2mb
yellow open .monitoring-es-6-2020.06.04 ASjXiuhkTwuUiuZ0sXQcjw 1 1 215820 320 111mb 111mb
yellow open .monitoring-kibana-6-2020.06.01 5sXzuGv0RSaCgQqXbWBsqA 1 1 8640 0 1.9mb 1.9mb
yellow open .monitoring-es-6-2020.06.01 tKb2GWnURki-guyW4Ssmfw 1 1 190529 222 108.4mb 108.4mb
yellow open .monitoring-es-6-2020.06.02 jl_eM6k5QVCtfikiL2K40A 1 1 199123 380 109mb 109mb
yellow open .monitoring-es-6-2020.05.31 odZ0ENHVT9mhXb4IXRHK7A 1 1 181885 324 103.6mb 103.6mb
yellow open .monitoring-kibana-6-2020.06.06 1ntSQo46TQa_dxG3otXNYQ 1 1 1232 0 427.6kb 427.6kb
yellow open .monitoring-kibana-6-2020.06.05 1rP2S6Z5S0GAbOsKSdupzg 1 1 8639 0 1.9mb 1.9mb
yellow open .kibana_task_manager oB1wUSZXRi-lYcoDr1ifLg 1 1 2 0 6.9kb 6.9kb
yellow open app_indices_two gZ8MZoyITHWIAxFAHmcskQ 5 1 9881 174 32.2mb 32.2mb
red open .apm-agent-configuration tVurV4DbTaiLrR5Rh_sgeA 1 1
yellow open .kibana_2 itghNKqNR9uooWt-KlykDg 1 1 76 2 83.4kb 83.4kb
yellow open .kibana_1 6dPwb_2gSmiSvPXfw3FMJg 1 1 12 1 43kb 43kb
yellow open kibana_sample_data_ecommerce 3KoZDmrMRhGnklR3Tom_xA 1 1 4675 0 4.7mb 4.7mb
red open filebeat-7.7.1-2020.06.06 WqlfkqXzQ0SMZXej7Va-qA 1 1
yellow open app_indices_three 0WrMg_5DQm6zbuHS14qwEA 1 1 2169 0 1.1mb 1.1mb
Update and Solution?
The issue was the _xpack settings that can be seen in the cluster settings call above. This was a holdover from when we were on 6.8 default; X-Pack is not available on OSS. Someone on the Elasticsearch forum gave me the answer, which I will enter as an answer on this question.
Upvotes: 0
Views: 484
Reputation: 4908
The issue was the _xpack settings that could be seen in the cluster settings call in the question. This was a holdover from when we were on 6.8 default; X-Pack is not available on OSS. Someone on the Elasticsearch forum gave me the way to remove that setting.
To remove the leftover X-Pack settings from an elasticsearch-oss installation when upgrading from elasticsearch-default:
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
"persistent" : {
"archived.*" : null
}
}
'
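To confirm the archived block is gone, the settings call from the question can be repeated (assuming the same localhost setup); the archived section should no longer appear under persistent:
# assumes the same localhost:9200 as in the question
curl -XGET 'localhost:9200/_cluster/settings?pretty'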
No restart was needed for me. Instantly my nodes started synchronizing!
Then the following can be executed to set allocation back to all, so that replica shards get assigned as well:
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
"persistent": {
"cluster.routing.allocation.enable": "all"
}
}
'
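To watch the replicas come up, the health and allocation endpoints from the question can be polled (again assuming localhost); unassigned_shards should drop toward 0 and the status should move from red through yellow to green:
# assumes the same localhost:9200 as above
curl -XGET 'localhost:9200/_cluster/health?pretty'
curl -XGET 'localhost:9200/_cat/allocation?v'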
Upvotes: 1
Reputation: 2179
You should change cluster.routing.allocation.enable to "all" or null:
curl -H'content-type: application/json' -XPUT '[master-ip]:9200/_cluster/settings' -d '
{
"persistent":{ "cluster.routing.allocation.enable" : "all"}
}'
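The null variant removes the setting so it falls back to the default (all); a sketch in the same form:
# same [master-ip] placeholder as above
curl -H'content-type: application/json' -XPUT '[master-ip]:9200/_cluster/settings' -d '
{
"persistent":{ "cluster.routing.allocation.enable" : null}
}'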
Also, you have 5 red indices, so it seems that you have another issue. First check which primary shards of these indices are unassigned (using /_cat/shards, as sketched below), then use the cluster allocation explain API to find the problem.
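For the first step, something like this should work, reusing the [master-ip] placeholder from above; grep just filters the _cat/shards output down to the unassigned rows:
# same [master-ip] placeholder as above
curl -XGET '[master-ip]:9200/_cat/shards?v' | grep UNASSIGNED
Each matching line shows the index name and shard number to plug into the explain request below: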
curl -H'content-type: application/json' -XGET '[master-ip]:9200/_cluster/allocation/explain?pretty' -d '
{
"index": "filebeat-7.7.1-2020.06.06",
"shard":[unassigned shard number],
"primary" : true
}'
Upvotes: 0