ZeroMQ message patterns

Question

I need to create a distributed system, where I have the following node types:

Client [1-n instances]
Server [1-n instances]
Proxy [1 instance - basically a forwarder to any Server]
Cloud Server [1 instance - basically a forwarder to any Proxy]

[Client] -> [Cloud Server] -> [Proxy -> Server] - a Distributed Setup

[Client -> Server] - a Local Setup

[The Proxy and Server are running on the same node or network]

The Client, once on the same network as the Server, should also be allowed to connect directly to the Server instead of going through the Cloud Server / Proxy.

The Server can have multiple Clients connected to it, and besides responding to requests from Client(s) it can also publish messages for them. The Server / Cloud Server need to differentiate the client nodes by id and know at any time whether they are connected or disconnected.

To my understanding, the Server should provide a REQ/REP endpoint in order to allow message exchange with the Proxy / Local Client and also a PUB endpoint, where the Proxy / Local Client will be subscribed for any notifications coming from the Server.

Concerning the Proxy, it looks like I will need two endpoints on the inside and two endpoints on the outside. Basically I will have ROUTER/DEALER endpoints for REQ/REP and XPUB/XSUB endpoints for PUB/SUB notifications targeting remote Clients. But my concern is that the Proxy on the outside will always have only one node to reply to and only one node subscribed to notifications, and this is the Cloud Server.

Concerning the Cloud Server, it looks like I will have something similar to the Proxy I described above, but unlike the Proxy I see that here the ROUTER/DEALER and XPUB/XSUB fit the bill.

Obviously I am new to ZeroMQ and it offers a lot. I would like to focus on what is needed for the time being and I would really appreciate your help.

user3666197 · Accepted Answer

Well, ZeroMQ is a great tool to design & build systems with, but the first thing I would recommend to anyone, be it a keen young novice or a seasoned, hands-on Computer Science veteran, is: "Stop considering the patterns a Plug-and-Play solution to everything."

Having built "a few" distributed, low-latency systems, I can say there are many recurring problems one will, sooner or later, meet in person.

Some of ZeroMQ's built-in primitives for the Formal Scalable Communication Patterns have "almost" matching behaviour, but one needs an ordering other than the built-in round-robin stepper. Another is "almost" matching, but has particular sequence-of-steps requirements, which one cannot guarantee in the reality of distributed agents. Simply put, there are many moments when one feels "almost" done, but some tiny ( or hidden ) detail makes a ( hidden ) call to "Houston, we have a problem..."

How to focus on what is needed?

Stop thinking in a classical, sequential manner.

Distributed systems are several orders of magnitude more complex than a plain, monolithic system programmed with sequential tools. Besides the principal design targets, there are many more things that can and will go wrong in production.

Revisit your design rules and carefully check for new:
1. scaling aspects: define hidden needs ( nodes, message sizes, traffic peaks )
2. blocking states: define additional signalling needs ( to allow getting out of all potential distributed-state dead-/live-locks )
3. survivability needs: a distributed system will meet lost messages and lost node(s)
4. incoherent protocol versions: for cases where no one can guarantee an enforced unity across the distributed system
5. self-defensive needs: in case a remote node starts some panic/flawed flooding of a signalling/messaging channel ( OOP as-is does not provide self-defensive tools and cannot limit calls injected by remote requestors; build a set of protective tools for internal self-healing protection in cases when an object's service is over-consumed or misused by an external caller, so as to harden your design against such erroneous / malicious modes of operation, which a typical OOP-model method cannot protect itself from ).
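
Items 3 to 5 of the checklist can be sketched in a few lines of plain Python, independent of the transport. This is only an illustration under stated assumptions: the class name PeerGuard, the integer protocol version and the one-second counting window are all hypothetical choices, not something ZeroMQ provides. The idea is that every incoming message carries a peer id and a version, any traffic counts as a heartbeat, and messages from a flooding peer or an unsupported version are refused before they reach the application logic.

```python
import time


class PeerGuard:
    """Hypothetical helper: per-peer liveness, version check and flood limit."""

    def __init__(self, supported_versions=(1,), timeout_s=3.0,
                 max_msgs_per_s=100, clock=time.monotonic):
        self.supported = set(supported_versions)
        self.timeout_s = timeout_s
        self.max_msgs_per_s = max_msgs_per_s
        self.clock = clock          # injectable, so the logic can be tested
        self.last_seen = {}         # peer id -> time of the last message seen
        self.window = {}            # peer id -> (window start, message count)

    def accept(self, peer_id, version):
        """Return True if a message from peer_id should be processed."""
        now = self.clock()
        self.last_seen[peer_id] = now           # any traffic is a heartbeat
        if version not in self.supported:       # item 4: protocol mismatch
            return False
        start, count = self.window.get(peer_id, (now, 0))
        if now - start >= 1.0:                  # begin a new one-second window
            start, count = now, 0
        self.window[peer_id] = (start, count + 1)
        return count < self.max_msgs_per_s      # item 5: refuse a flood

    def connected(self):
        """Item 3: peers heard from within the last timeout_s seconds."""
        now = self.clock()
        return {p for p, t in self.last_seen.items()
                if now - t <= self.timeout_s}
```

A server loop would call accept() with the sender's id and declared version for every received message and consult connected() whenever it needs the current set of live clients; peers that stay silent longer than timeout_s simply drop out of that set.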

The Best Next Step:

Real-world System Architecture simply must contain more "wires"

If your code strives to go into production, rather than remain just an academic example, there will be much more work to do to provide survivability measures for the hostile real-world production environments.

An absolutely great perspective on doing this, and a good read for realistic designs with ZeroMQ, is Pieter HINTJENS' book "Code Connected, Vol.1" ( you may check my posts on ZeroMQ to find the book's direct pdf-link ).

Plus, another good read comes from Martin SUSTRIK, the co-father of ZeroMQ, on low-level truths about the ZeroMQ implementation details & scalability.

Epilogue: As a bonus, the REQ/REP communication pattern is dangerous per se in the real world, as it can self-deadlock the pair of communicating processes in case a transport ( for whatever reason ) has lost a packet and the "pendulum"-style message delivery gets incomplete and stays locked forever.
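
One common way to keep the requesting side from locking up is to poll for the reply with a timeout and, if nothing arrives, throw the REQ socket away and retry on a fresh one. Below is a minimal sketch of that idea, assuming the pyzmq binding; the function name request_with_retries and its default timeout and retry count are illustrative choices only. It protects the requester alone, so a server that must survive lost requests needs its own measures.

```python
import zmq


def request_with_retries(ctx, endpoint, payload, timeout_ms=1000, retries=3):
    """Send payload on a REQ socket; return the reply, or None if none came."""
    for _ in range(retries):
        sock = ctx.socket(zmq.REQ)
        sock.setsockopt(zmq.LINGER, 0)      # do not hang on close()
        sock.connect(endpoint)
        try:
            sock.send(payload)
            # Wait a bounded time for the reply instead of blocking in recv().
            if sock.poll(timeout_ms, zmq.POLLIN):
                return sock.recv()
        finally:
            # A REQ socket that missed its reply cannot send again,
            # so it is closed and a new one is created for the next attempt.
            sock.close()
    return None
```

A caller would treat a None result as "server unreachable for now" and decide at the application level whether to back off, fail over or report the error.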
