Reputation: 115
I have a cluster that was created in AWS and set up manually with one host. We are trying to add more hosts to the same cluster. I chose the REST Admin API (/admin/v1/cluster-config, https://docs.marklogic.com/REST/POST/admin/v1/cluster-config) to add the host. I configured the steps accordingly and ran the script without any errors (verified from the terminal). The host was added to the cluster, but when I checked its status in the Admin page, it showed:
host status -- A detailed view of this host's status.
This host is down. The following error occured while trying to contact
it:
XDMP-HOSTOFFLINE: Host is offline or not responding
Host marklogic-node2-abcd.org
Online Disconnected
In addition, my node was not active and was completely disconnected (from the UI we could not see the default.xqy page on admin port 8001). So we restarted the node and removed the config (data volume).
After rebooting node2, I can see node2 in the cluster, and when I try to access node2 by host name, it responds with http://marklogic-node2-abcd.org:8001/initialize-admin.xqy
This server must now self-install the initial databases and
application servers. Click OK to continue.
A couple of questions I would like answered:
How do I debug the script, and where can I find the failure details?
If my default databases or application servers were not configured, do I need to delete the host from the cluster and reconfigure it?
How can I write more logs to find the errors and make my life easier?
Upvotes: 0
Views: 191
Reputation: 3732
This can be very tricky to debug without deep knowledge of AWS, Linux, networking protocols, and MarkLogic. I highly recommend starting over using the managed cluster feature, preferably starting with the supplied CloudFormation template sample. You should have that up in 10 minutes. Copy your data over to the new cluster and you're good to go.
If you need to debug what you have, start by reading the docs on MarkLogic on AWS/EC2 completely, and augment them with the relevant AWS docs, particularly on networking, routing, subnets, VPCs, and DNS. In the end you will most likely still need to rebuild your cluster. The docs explain where to look for logs and what pitfalls to avoid. In particular, they strongly recommend not attempting a manual setup without serious consideration of the consequences, the first being that it is quite difficult to debug.
If you do want to continue down the 'triple black diamond slope', a starting point is verifying that DNS and TCP/IP work perfectly from each node to every other node, and that the hostname MarkLogic assigns resolves to the same IP as the DNS name, on each node, before installing MarkLogic for the first time. Your example showed a custom DNS name; this is unlikely to be the actual host name discovered by MarkLogic at startup (see the docs above). Read, then reread, then sleep on it and read the docs again in their entirety. Then practice on safe dev machines a few dozen (or 100) times to learn the signs of a working configuration.
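As a rough illustration of that DNS and TCP/IP check, here is a minimal Python sketch to run on each node. It is not an official MarkLogic tool. The port numbers are assumptions (7999 as the default inter-host bind port, 8001 as the Admin port from your question), and the node1 host name is hypothetical, so substitute your own values.

```python
import socket

# Assumed ports: 7999 (MarkLogic's default inter-host bind port) and
# 8001 (Admin interface). Adjust if your cluster uses different ones.
PORTS = (7999, 8001)


def resolve(hostname):
    """Return the sorted IPv4 addresses this machine resolves hostname to."""
    try:
        infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # name does not resolve from this node
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or unresolvable
        return False


def check_nodes(hostnames, ports=PORTS):
    """Collect resolved IPs and per-port reachability for each hostname."""
    report = {}
    for name in hostnames:
        report[name] = {
            "ips": resolve(name),
            "ports": {port: can_connect(name, port) for port in ports},
        }
    return report


if __name__ == "__main__":
    # node2 is the name from the question; node1 is a hypothetical placeholder.
    nodes = ["marklogic-node1-abcd.org", "marklogic-node2-abcd.org"]
    local = socket.gethostname()
    print("local hostname:", local, "->", resolve(local))
    for name, result in check_nodes(nodes).items():
        print(name, result)
```

Run it on every node and compare the output. Each host name should resolve to the same IP everywhere, the local hostname should match what the other nodes use to reach this machine, and every port should report True. Any mismatch is worth fixing before attempting the cluster join again.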
Bootstrapping a cluster join is more subtle than it may appear, and much, much harder to fix if it has gone wrong. If you want to do this yourself (as opposed to using the managed cluster feature, which does it for you), definitely start with non-production 'blank' servers and practice and refine until it runs perfectly many, many times in a row.
Upvotes: 1