sacha barber

Reputation: 2333

NServiceBus clustered workers / Distributor usage

Current Setup

We have a UI (well, more than one UI, but that is not relevant), and we have 2 load-balanced app servers. As such, the UI talks to an alias, behind which sit the 2 load-balanced app servers. The app servers are also self-hosting NServiceBus endpoints. The app server dealing with the current request (this could be either App Server 1 or App Server 2) is capable of sending the commands shown in the config below using the self-hosted NServiceBus.

The "App Server(s)" current App.Config

As such the App.Config for each app server has something like this

  <UnicastBusConfig ForwardReceivedMessagesTo="audit">
    <MessageEndpointMappings>
      <add Assembly="Messages" Type="PublisherCommand" Endpoint="Publisher" />
      <add Assembly="Messages" Type="Worker1Command" Endpoint="Worker1" />
      <add Assembly="Messages" Type="Worker2Command" Endpoint="Worker2" />
      <!-- This one is sent locally only -->
      <add Assembly="Messages" Type="RunCalculationCommand" Endpoint="Dealing" />
    </MessageEndpointMappings>
  </UnicastBusConfig>
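
For illustration, the sending side on an app server might look something like this minimal sketch (the RequestProcessor class and the empty message bodies are assumptions; only the message type names come from the config above):

using NServiceBus;

// Message types from the Messages assembly (bodies assumed empty for the sketch).
public class PublisherCommand : ICommand { }
public class Worker1Command : ICommand { }
public class Worker2Command : ICommand { }
public class RunCalculationCommand : ICommand { }

public class RequestProcessor
{
    // Injected by the NServiceBus container (assumes property injection is configured).
    public IBus Bus { get; set; }

    public void Process()
    {
        Bus.Send(new PublisherCommand());           // routed to Publisher via the mapping
        Bus.Send(new Worker1Command());             // routed to Worker1
        Bus.Send(new Worker2Command());             // routed to Worker2
        Bus.SendLocal(new RunCalculationCommand()); // stays on this app server's own queue
    }
}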

The “Publisher” current App.Config

Currently the “Publisher” App.Config looks like this:

<UnicastBusConfig ForwardReceivedMessagesTo="audit">
  <MessageEndpointMappings>
  </MessageEndpointMappings>
</UnicastBusConfig>

The “Worker(s)” current App.Config

Currently the worker App.Configs only have to subscribe to one other endpoint, the “Publisher”; their config files look like this:

<UnicastBusConfig ForwardReceivedMessagesTo="audit">
  <MessageEndpointMappings>
    <add Assembly="Messages" Type="SomeEvent" Endpoint="Publisher" />
  </MessageEndpointMappings>
</UnicastBusConfig>
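
On each worker, that subscription pairs with an ordinary event handler, roughly like this sketch (the handler class name is made up; SomeEvent is the type from the config above):

using NServiceBus;

// Event published by the Publisher endpoint (body assumed empty for the sketch).
public class SomeEvent : IEvent { }

public class SomeEventHandler : IHandleMessages<SomeEvent>
{
    public void Handle(SomeEvent message)
    {
        // React to the event; the subscription itself is driven by the
        // MessageEndpointMappings entry above.
    }
}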

All other messages to the workers currently come directly from one of the app servers, as shown in the app servers' App.Config above.

This is all working correctly.

Thing is, we have a single point of failure: if the “Ancillary Services Box” dies, we are stuffed.

So we are wondering if we could make use of multiple “Ancillary Services Boxes” (each with a Publisher/Worker1/Worker2). Ideally they would work exactly as described above, and as shown in the diagram above, where if “Ancillary Services Box 1” is available it is used; otherwise we use “Ancillary Services Box 2”.

I have read about the distributor (but not used it), which, if I have it correct, we may be able to use in the app servers themselves, where we treat each app server as both a distributor and a worker (to cover the SendLocal of the RunCalculationCommand we need to run).

The “Ancillary Services Box” would then have to use the distributor for each of the contained endpoints.

So we may end up with something like this:

What we may want to do

Could someone help me figure out whether I am even thinking about this the right way, or whether I am way off?

Essentially, what I want to know is whether this distributor-based setup is the right way to remove the single point of failure.

Upvotes: 2

Views: 587

Answers (1)

janovesk

Reputation: 1138

The distributor is a good approach here, but it comes at the cost of increased infrastructure complexity. To avoid introducing another single point of failure, the distributor and its queues must run on a Windows Failover Cluster, meaning both MSMQ and DTC must be configured as clustered services. This can be oh so much fun... :D

I've renamed what you call "workers" to endpoints: Worker1 to Endpoint1 and Worker2 to Endpoint2. This is because "worker" is very clearly defined as something specific once you introduce the distributor: an actual physical endpoint on a machine that is receiving messages from a distributor is a worker. So Endpoint1@ServicesBox01, Endpoint2@ServicesBox02, etc. are all workers. Workers get work from the distributor.

Scenario 01

Command only scenario

In the first scenario the app server gets a request from the load balancer and sends a command to the Endpoint1@Cluster01 or Endpoint2@Cluster01 queue on the distributor, depending on the command. The distributor then finds a ready worker for the message in that queue and sends the command along to it. So for Worker1Command, EITHER Endpoint1@ServicesBox01 OR Endpoint1@ServicesBox02 ends up getting the command from the distributor and processes it as normal.
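
Note that the worker code itself needs no distributor awareness; a plain handler along these lines (class name hypothetical, Worker1Command being the type from the question's Messages assembly) works unchanged whether the command arrives directly or via the distributor:

using NServiceBus;

public class Worker1CommandHandler : IHandleMessages<Worker1Command>
{
    public void Handle(Worker1Command message)
    {
        // Same processing as before; the worker neither knows nor cares
        // that the message was handed out by the distributor.
    }
}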

Scenario 02

Command and event scenario

In scenario two it's pretty much the same. The PublisherCommand is sent to Endpoint3@Cluster01. The distributor picks one of the ready Endpoint3 workers, in this case Endpoint3@ServicesBox02, and gives it the command. ServicesBox02 processes the message and publishes SomeEvent to Endpoint1@Cluster01 and Endpoint2@Cluster01. These are picked up by the distributor and, in this case, sent on to Endpoint1@ServicesBox01 and Endpoint2@ServicesBoxN (whichever Endpoint2 worker is ready).
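
The publishing side of that flow could be a handler like this sketch (the handler name and property injection are assumptions; PublisherCommand and SomeEvent are the types from the question):

using NServiceBus;

public class PublisherCommandHandler : IHandleMessages<PublisherCommand>
{
    // Injected by the container (assumes property injection is configured).
    public IBus Bus { get; set; }

    public void Handle(PublisherCommand message)
    {
        // The subscription store holds Endpoint1@Cluster01 and Endpoint2@Cluster01,
        // so the event goes back out through the distributor, which hands it to
        // one ready worker per logical endpoint.
        Bus.Publish(new SomeEvent());
    }
}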

Notice how the messages ALWAYS flow THROUGH the distributor and the queues on Cluster01. This is what gives you actual load balancing with MSMQ.

The config for the app servers changes to make sure the commands go through the cluster:

<UnicastBusConfig ForwardReceivedMessagesTo="audit">
  <MessageEndpointMappings>
    <add Assembly="Messages" Type="PublisherCommand" Endpoint="Endpoint3@Cluster01" />
    <add Assembly="Messages" Type="Worker1Command" Endpoint="Endpoint1@Cluster01" />
    <add Assembly="Messages" Type="Worker2Command" Endpoint="Endpoint2@Cluster01" />
    <!-- This one is sent locally only -->
    <add Assembly="Messages" Type="RunCalculationCommand" Endpoint="Dealing" />
  </MessageEndpointMappings>
</UnicastBusConfig>

ServicesBox config changes slightly to make sure subscriptions go through the distributor as well:

<UnicastBusConfig ForwardReceivedMessagesTo="audit">
  <MessageEndpointMappings>
    <add Assembly="Messages" Type="SomeEvent" Endpoint="Endpoint3@Cluster01" />
  </MessageEndpointMappings>
</UnicastBusConfig>
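
Assuming NServiceBus v3/v4 with the built-in distributor support, each worker would typically also run under the worker profile and declare which machine hosts its distributor queues, along the lines of this sketch (the exact section name and profile mechanism depend on your NServiceBus version):

<!-- Points this worker at the machine hosting the distributor queues. -->
<MasterNodeConfig Node="Cluster01" />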

No changes for the publisher config. It doesn't need to point to anything. The subscribers will tell it where to publish.

Upvotes: 5
