Reputation: 174
First of all, I am aware that there are good (and lightweight) message brokers available, like NATS. If this were a job, I'd certainly go with a proven solution; this is more about curiosity and the will to understand.
Let's say I want to build a system like a CRM and I want to base it on microservices so that it is easily extensible and can be adapted to workloads. Since microservices should be decoupled, in comes pub-sub. For pub-sub to work as intended (decoupling publisher and subscriber), I need a messaging system. Let's say I want to realize this with node.js (being fully aware that there are much quicker ways to get this done).
My "issue", or potentially just cognitive failure, is wrapping my head around how to make sure that all subscribers have received the messages from the subject they subscribed to.
The client/frontend sends an Event Request to the broker. The broker potentially verifies the message and puts it on the intended queue. There are 2 microservices subscribed to this queue. The broker is now just sending the oldest event on the queue with a callback to both microservices.
Wouldn't this cause issues when one of the microservices is significantly slower than the other?
I mean, it should work as long as I don't want to send back acknowledgement messages that indicate the task is done by all subscribers. The client doesn't know how many services were involved with the Event Request so it can't track it. So it needs to be done by the broker.
Does that mean I need to build this into the message broker, so that it keeps track of each subscribed service's processing status for a given event?
Upvotes: 2
Views: 1121
Reputation: 4278
As you have found out through your own thinking, it's hard to implement the pub-sub/topic pattern using 1 queue, because that 1 queue would then have to keep track of the messages for every subscriber. That's a lot of responsibility for 1 queue.
Typically, the pub-sub/topic pattern is implemented using several queues:
Each subscriber queue acts as a mailbox for a specific address. If you have 5 subscribers, then you will have 5 subscriber queues.
The broker distributes the messages from the incoming queue to each subscriber queue, depending on the configured rate at which the subscriber queues are refilled.
This allows each queue to deal with only 1 specific subscriber, and it's much easier to track what that subscriber has consumed with acknowledgements. Also, each subscriber is able to consume the messages in its subscriber queue at its own pace.
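To make the mailbox idea concrete, here is a minimal in-memory sketch in TypeScript. It is not how any particular broker does it; `SubscriberQueue`, `FanOutBroker`, and `demoFanOut` are names made up for illustration, there is no networking or persistence, and the refill rate mentioned above is left out (messages are copied to every subscriber queue immediately).

```typescript
// Hypothetical in-memory sketch: one queue per subscriber, acknowledged individually.
type Message = { id: number; payload: string };

class SubscriberQueue {
  private pending: Message[] = [];

  enqueue(msg: Message): void {
    this.pending.push(msg);
  }

  // Oldest message that has not been acknowledged yet (not removed).
  peek(): Message | undefined {
    return this.pending[0];
  }

  // Remove the oldest message only when the subscriber confirms that exact id.
  ack(id: number): boolean {
    const head = this.pending[0];
    if (head === undefined || head.id !== id) return false;
    this.pending.shift();
    return true;
  }

  backlog(): number {
    return this.pending.length;
  }
}

class FanOutBroker {
  private queues: SubscriberQueue[] = [];
  private nextId = 1;

  subscribe(): SubscriberQueue {
    const q = new SubscriberQueue();
    this.queues.push(q);
    return q;
  }

  // Copy the message into every subscriber's own queue.
  publish(payload: string): number {
    const msg: Message = { id: this.nextId++, payload };
    for (const q of this.queues) q.enqueue(msg);
    return msg.id;
  }
}

function demoFanOut(): { fastBacklog: number; slowBacklog: number } {
  const broker = new FanOutBroker();
  const fast = broker.subscribe();
  const slow = broker.subscribe();
  broker.publish("customer.updated");
  broker.publish("customer.deleted");

  // The fast service drains its queue; the slow one has not acked anything yet.
  let next = fast.peek();
  while (next !== undefined) {
    fast.ack(next.id);
    next = fast.peek();
  }
  return { fastBacklog: fast.backlog(), slowBacklog: slow.backlog() };
}
```

The slow subscriber's backlog simply grows in its own queue and never blocks the fast one, which is the point of splitting the queues.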
Upvotes: 0
Reputation: 174
After more research and a few hours lying awake in bed, I came to the conclusion that having multiple subscribers to one subject/topic should be considered bad practice if the publisher wants to receive responses/acknowledgements in order to keep track of the status of the sent request/message/event.
After a few more thoughts I came to the conclusion that multiple services subscribing to the same subject are most likely never necessary - at least in my scenario, as long as I design the services properly. The only scenario I could think of was adding certain features at a later point in time without touching the already deployed service. This feels like a fix for an unsuitable service design.
Then I thought about how I could manage it anyway and came up with 3 approaches.
No further explanation needed, I suppose. Don't mind the details of some of the methods; it's just a brainstormed version which is definitely not ideal. It's enough to display the pattern.
Since the Broker keeps track of every Subscriber, it always knows (or can easily calculate) the number of responses to expect. It could therefore redirect the response messages to an Aggregator Subject that gets created automatically whenever a message is published that needs a response or a success message (think of an update of some customer data - you obviously want to know that the message got through and was successfully processed).
Of course, the Aggregator could always sit in between, even if there's just one response coming back. That would reduce the number of cases to cover. The Aggregator is basically some sort of proxy. It still adds complexity to the Broker though.
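A rough sketch of how the Aggregator idea could look in TypeScript follows. All names (`Aggregator`, `AggregatingBroker`, `collect`, `demoAggregator`) are invented for illustration, everything is in memory, and delivery to the subscribers is left out. Calling the callback straight away when a subject has no subscribers is my own assumption.

```typescript
// Hypothetical sketch: the broker creates one aggregator per published message,
// sized to the number of subscribers it knows about for that subject.
type Reply = { subscriber: string; ok: boolean };

class Aggregator {
  private replies: Reply[] = [];
  private expected: number;
  private onComplete: (replies: Reply[]) => void;

  constructor(expected: number, onComplete: (replies: Reply[]) => void) {
    this.expected = expected;
    this.onComplete = onComplete;
  }

  // Called for every subscriber response the broker redirects here.
  collect(reply: Reply): void {
    if (this.replies.length >= this.expected) return; // ignore unexpected extras
    this.replies.push(reply);
    if (this.replies.length === this.expected) this.onComplete(this.replies);
  }
}

class AggregatingBroker {
  private subscribers: { [subject: string]: string[] } = {};

  subscribe(subject: string, name: string): void {
    const list = this.subscribers[subject] || [];
    list.push(name);
    this.subscribers[subject] = list;
  }

  // The publisher only learns the combined outcome, not who was involved.
  publish(subject: string, onAllDone: (allOk: boolean) => void): Aggregator {
    const names = this.subscribers[subject] || [];
    if (names.length === 0) onAllDone(true); // assumption: nothing to wait for
    return new Aggregator(names.length, (replies) => {
      let allOk = true;
      for (const r of replies) if (!r.ok) allOk = false;
      onAllDone(allOk);
    });
  }
}

function demoAggregator(): string[] {
  const log: string[] = [];
  const broker = new AggregatingBroker();
  broker.subscribe("customer.update", "billing");
  broker.subscribe("customer.update", "mailing");

  const agg = broker.publish("customer.update", (allOk) => {
    log.push("done:" + allOk);
  });
  agg.collect({ subscriber: "billing", ok: true });
  log.push("waiting"); // only one of two responses has arrived so far
  agg.collect({ subscriber: "mailing", ok: true });
  return log;
}
```

The publisher passes a single callback and never sees the subscriber list, which is the decoupling this approach buys at the cost of extra code in the Broker.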
First of all: don't mind the mess with the connections on the right. It works for me as a sketch but is far from tidy.
Every message that is published is answered by the Broker with an acknowledge message. That message is put on the message's individual subject stack. Since the Broker knows how many Subscribers every Subject has, it can send back how many responses a Publisher should expect. Acknowledge messages in general are also useful for notifying Publishers whether their message/event/request was accepted or not (think of an authentication and authorization pattern here).
This will work as long as the Publisher always wants a response. If it doesn't, messages could stick around for quite a while. A timeout could solve this.
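Here is a small TypeScript sketch of the acknowledge-plus-timeout idea. The names (`AckingBroker`, `Ack`, `expire`) and the explicit `now`/`ttlMs` parameters are made up so that the example runs without real timers. Rejecting a publish when nobody is subscribed is my own assumption, standing in for whatever acceptance check (for example authorization) the Broker would really do.

```typescript
// Hypothetical sketch: the broker acknowledges each publish with the number of
// responses to expect and drops tracking entries that outlive their deadline.
type Ack = { accepted: boolean; messageId: number; expectedResponses: number };
type PendingEntry = { remaining: number; deadline: number };

class AckingBroker {
  private subscriberCount: { [subject: string]: number } = {};
  private pending: { [messageId: number]: PendingEntry } = {};
  private nextId = 1;

  subscribe(subject: string): void {
    this.subscriberCount[subject] = (this.subscriberCount[subject] || 0) + 1;
  }

  publish(subject: string, now: number, ttlMs: number): Ack {
    const expected = this.subscriberCount[subject] || 0;
    if (expected === 0) {
      // Assumption: no subscribers means the message is not accepted.
      return { accepted: false, messageId: -1, expectedResponses: 0 };
    }
    const messageId = this.nextId++;
    this.pending[messageId] = { remaining: expected, deadline: now + ttlMs };
    return { accepted: true, messageId, expectedResponses: expected };
  }

  // A subscriber reports that it finished processing the message.
  respond(messageId: number): void {
    const entry = this.pending[messageId];
    if (!entry) return;
    entry.remaining -= 1;
    if (entry.remaining === 0) delete this.pending[messageId];
  }

  // Drop every entry whose deadline has passed; returns the expired ids.
  expire(now: number): number[] {
    const expired: number[] = [];
    for (const key of Object.keys(this.pending)) {
      const id = Number(key);
      const entry = this.pending[id];
      if (entry && entry.deadline <= now) {
        expired.push(id);
        delete this.pending[id];
      }
    }
    return expired;
  }
}
```

In a real service `expire` would be driven by a timer; passing the clock in explicitly just keeps the sketch deterministic.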
This is very similar to approach 2, with the difference that the transport protocol is used to inform the Publisher about the status of the sent request and the potential number of responses to expect.
Since most if not all protocols suitable for this kind of topology offer some form of response message, and since those should be used anyway to verify that the message was sent successfully in the first place, the response could also carry a payload informing the client not only about the successful transfer but also about how many responses to expect.
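As an illustration only, assuming an HTTP-style request/response transport (I haven't settled on a protocol), the response to a publish request could look like the sketch below. `handlePublish` and `PublishResponse` are hypothetical names, and the choice of status codes (202 for accepted, 404 for a subject without subscribers) is my own assumption.

```typescript
// Hypothetical sketch: the transport-level response itself tells the client
// how many subscriber responses to expect.
type PublishResponse = {
  status: number;
  body: { accepted: boolean; expectedResponses: number };
};

function handlePublish(
  subject: string,
  subscriberCounts: { [subject: string]: number }
): PublishResponse {
  const expected = subscriberCounts[subject] || 0;
  if (expected === 0) {
    // Assumption: a subject nobody subscribed to is reported as not found.
    return { status: 404, body: { accepted: false, expectedResponses: 0 } };
  }
  // 202 Accepted: received for processing, processing not yet completed.
  return { status: 202, body: { accepted: true, expectedResponses: expected } };
}
```

Compared to approach 2, no extra acknowledge message travels through the message system; the count rides along on the response the transport sends anyway.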
I'd say the Aggregator approach is too much overhead, and it needs more extra code than just using either the transport protocol or the message system itself. The Aggregator is interesting because the client can be entirely oblivious to the services and is therefore decoupled.
The usage of the message system is interesting for logging purposes as well (potential debugging) and the implementation of Sagas (chains of events).
I do not promote any of these approaches as being best practice. I solely want to answer my own question with the results of my research.
Upvotes: 1