Reputation:
In a clustered environment, I assume the load balancer passes each request to one of the E-nodes. How does each E-node know which D-nodes to access when a particular query is executed? I am a bit confused about how the indexes and caches work in a clustered environment.
Upvotes: 1
Views: 43
Reputation: 20414
Let me explain the distinction between E- and D-nodes first.
Any host that participates in a MarkLogic cluster can potentially operate as E or D, or even both.
Whether a host operates as an E-node depends on whether it is in a group with app servers that are relevant to you, such as one that exposes a REST API you need. So not just Admin or App-Services, but usually something more specific.
Whether a host operates as a D-node depends on whether it holds any forests of a database that is relevant to you, such as one that holds some or all of the data used by a relevant app server. So not just Modules or Documents, but usually something more specific.
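To make those two rules concrete, here is a toy Python model. It is only a sketch and does not use any MarkLogic API: the `Host` class, the `roles` function, and the host, group, and database names are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical stand-in for one host in a cluster (not a MarkLogic type)."""
    name: str
    group: str
    # Names of the databases that have at least one forest on this host.
    forest_databases: list = field(default_factory=list)


def roles(host, app_server_groups, relevant_databases):
    """Return the roles ('E', 'D') a host plays for one application.

    app_server_groups: groups that contain the app servers you care about.
    relevant_databases: databases those app servers read from or write to.
    """
    result = set()
    if host.group in app_server_groups:
        result.add("E")  # host evaluates requests for a relevant app server
    if any(db in relevant_databases for db in host.forest_databases):
        result.add("D")  # host stores forests of a relevant database
    return result


if __name__ == "__main__":
    hosts = [
        Host("host-1", "Evaluators"),
        Host("host-2", "Data", ["my-content"]),
        Host("host-3", "Evaluators", ["my-content", "Modules"]),
    ]
    for h in hosts:
        print(h.name, sorted(roles(h, {"Evaluators"}, {"my-content"})))
```

Under these assumptions, a host in a relevant group without relevant forests comes out as E only, a host with relevant forests outside that group as D only, and a host that has both as E and D.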
All hosts in a cluster have a complete copy of the cluster config. MarkLogic will take care of getting data when one host needs data located in a forest on a different host.
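Conceptually, that is how an E-node knows where to go: it consults its own copy of the cluster config to see which hosts hold forests of the target database, and collects results from those. The Python below is a rough mental model only, not how MarkLogic is implemented internally; the config layout, forest contents, and function names are all hypothetical.

```python
# Toy cluster config that every host is assumed to hold a copy of:
# database name -> {forest name: host that holds that forest}
CLUSTER_CONFIG = {
    "my-content": {"content-1": "host-2", "content-2": "host-3"},
}

# Toy forest contents: forest name -> document URIs stored in it.
FOREST_DATA = {
    "content-1": ["/orders/1.xml", "/customers/7.xml"],
    "content-2": ["/orders/2.xml"],
}


def d_nodes_for(database, config):
    """Hosts that hold at least one forest of the database (the D-nodes)."""
    return sorted(set(config[database].values()))


def evaluate(database, predicate, config, forest_data):
    """Model an E-node asking every forest of a database and merging results."""
    results = []
    for forest, host in config[database].items():
        # In a real cluster this step would be a request to `host`;
        # here the forest contents are simply read from a dictionary.
        results.extend(uri for uri in forest_data[forest] if predicate(uri))
    return sorted(results)


if __name__ == "__main__":
    print(d_nodes_for("my-content", CLUSTER_CONFIG))
    print(evaluate("my-content", lambda uri: uri.startswith("/orders/"),
                   CLUSTER_CONFIG, FOREST_DATA))
```

The point of the sketch is that the lookup needs no central coordinator: any host with the config can work out which hosts to ask.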
So, D-nodes are about data storage, and that includes the indexes, both on disk and in memory.
E-nodes are used to 'evaluate' incoming requests, hence the 'E'. Some caching happens on D-nodes, but expanded tree caches and the like typically reside on the E-node, so that it doesn't need to access other hosts to fetch data that is already cached.
You normally don't need to worry too much about all this until you reach a stage where you need to tweak performance, which can be very case-specific. It can be useful to ask MarkLogic to help with that, if you are in a position to do so.
Now, with regard to load balancing: that only concerns incoming requests, so it is relevant to E-nodes only. If all hosts are in one group (not uncommon), every host can act as an E-node. The load balancer needs to know the IP addresses or host names of those machines to relay traffic. In a virtualized environment you probably want to take it a step further and allow automatic scaling up and down. The MarkLogic Query Service is also relevant here.
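As a minimal illustration of what the load balancer does with that list of E-node addresses, here is a round-robin picker in Python. It is a sketch, not a replacement for a real load balancer: the host names, the port number 8011, and the `make_picker` helper are assumptions made up for this example.

```python
from itertools import cycle

# Hypothetical E-node host names and app-server port; substitute your own.
E_NODES = ["ml-host-1.example.com", "ml-host-2.example.com", "ml-host-3.example.com"]
APP_SERVER_PORT = 8011


def make_picker(hosts, port):
    """Return a function that maps a request path to the next E-node URL."""
    ring = cycle(hosts)  # endlessly iterates over the hosts in order

    def next_url(path):
        return f"http://{next(ring)}:{port}{path}"

    return next_url


if __name__ == "__main__":
    pick = make_picker(E_NODES, APP_SERVER_PORT)
    for _ in range(4):
        # The fourth request wraps around to the first host again.
        print(pick("/v1/search?q=example"))
```

Note that nothing here refers to D-nodes: the balancer only spreads requests over hosts that evaluate them, and MarkLogic handles fetching the data behind that.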
HTH!
Upvotes: 2