Reputation: 43

How to detect a broken network connection with the ZeroMQ monitor mechanism?

A. Description

I am using the ZeroMQ monitor and find that it works for a logical disconnection but not when the network itself goes down (for example, when I unplug the cable).

For example:

I launch the client app on an Android pad, then launch the server app on my Windows laptop. They are connected through a router with cables.
Everything works with the monitor if I close or open either the client app or the server app manually; that is, the monitor on each side receives a 'Connect' or an 'Accept' event and a 'Disconnect' event.
But if I unplug the cable on the server side while the client and server are connected and running, the monitors on both sides do not detect a 'Disconnect' event.

Is the monitor designed like this?

If so, are there any ways to detect a network failure (a cable-unplug event) other than heartbeats?

If not, how can I use ZeroMQ's own monitor mechanism to solve this problem? Would a setTCPKeepAlive() interface be useful?

B. System environment

My scenario is as below.

Client

OS: Android, running on a pad, IDE: Android Studio 2.3, lib: jeromq-0.4.3

// Java Code

String monitorAddr = "inproc://client.req";
ZContext ctx = new ZContext();
ZMQ.Socket clientSocket = ctx.createSocket(ZMQ.REQ);
clientSocket.monitor(monitorAddr, ZMQ.EVENT_ALL);

// Then start a monitor thread, which I implemented myself.

Server

OS: Windows 7 ( 64 bit ), running on my laptop, IDE: VS2013, lib: Clrzmq4

// C# Code

const string MonitorEndpoint = "inproc://server.rep";

var ctx = new ZContext();
var serverSocket = new ZSocket(ctx, ZSocketType.REP);
ZError error;

// Attach a monitor endpoint to serverSocket
if (!serverSocket.Monitor(MonitorEndpoint, ZMonitorEvents.AllEvents, out error))
{
    if (error == ZError.ETERM)
        return;    // Interrupted
    throw new ZException(error);
}

// Create a monitor
ZMonitor _monitor = ZMonitor.Create(ctx, MonitorEndpoint);
_monitor.AllEvents += _monitor_AllEvents;
_monitor.Start();

Upvotes: 3

Answers (2)

foo

Reputation: 2131

Trying to go lower than the ZMQ protocol itself and access the TCP connection that specific ZeroMQ sockets use (while others do not) doesn't sound like a good idea; it would require breaking encapsulation in multiple classes.

The answer @bazza gave in 2017 was entirely correct at the time.

However, newer versions of ZMQ (specifically ZMTP) include heartbeat functionality.

Check the ZMQ documentation for:

Socket option	Java functions	Name	Purpose
ZMQ_HEARTBEAT_IVL	get/setHeartbeatIvl()	heartbeat interval	milliseconds between ZMTP PINGs
ZMQ_HEARTBEAT_TIMEOUT	get/setHeartbeatTimeout()	local heartbeat timeout	how long the local socket waits between received packets until it considers the connection timed out
ZMQ_HEARTBEAT_TTL	get/setHeartbeatTtl()	remote heartbeat timeout	if and when the remote side shall consider the connection timed out

ZMQ_HEARTBEAT_CONTEXT is still in draft state as of 2022. It is supposed to send a byte[] context with every ping.

Now, by design of ZMQ, quoting from chapter 2 of its documentation:

"The network connection itself happens in the background, and ZeroMQ will automatically reconnect if the network connection is broken (e.g., if the peer disappears and then comes back)."

Thus, answering your main question, I'd expect the monitor to give you ZMQ_EVENT_CONNECT_RETRIED / ZMQ_EVENT_CONNECTED events after the underlying connection was detected as disrupted.
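
As a rough sketch of how the heartbeat options listed above might be combined with the monitor on the JeroMQ client, something like the following could serve as a starting point. It assumes a JeroMQ release that provides the heartbeat setters and the SocketType enum (newer than the 0.4.3 from the question); the server address, the interval values, and the class name are made up for illustration, and which events actually arrive after a cable is unplugged is not something I have verified.

```java
import org.zeromq.SocketType;
import org.zeromq.ZContext;
import org.zeromq.ZMQ;

// Hypothetical class name; requires the JeroMQ library on the classpath.
public class HeartbeatMonitorSketch {
    public static void main(String[] args) {
        String monitorAddr = "inproc://client.req";
        try (ZContext ctx = new ZContext()) {
            ZMQ.Socket client = ctx.createSocket(SocketType.REQ);

            // Set the heartbeat options before connecting so they apply
            // to the connection. All values are illustrative assumptions.
            client.setHeartbeatIvl(1000);     // milliseconds between ZMTP PINGs
            client.setHeartbeatTimeout(3000); // local heartbeat timeout in ms
            client.setHeartbeatTtl(6000);     // timeout suggested to the remote side in ms

            // Publish socket events on an inproc endpoint and read them
            // back through a PAIR socket.
            client.monitor(monitorAddr, ZMQ.EVENT_ALL);
            ZMQ.Socket monitor = ctx.createSocket(SocketType.PAIR);
            monitor.connect(monitorAddr);

            client.connect("tcp://192.168.0.10:5555"); // hypothetical server address

            while (!Thread.currentThread().isInterrupted()) {
                ZMQ.Event event = ZMQ.Event.recv(monitor);
                if (event == null) {
                    break; // no event could be read, e.g. the context is shutting down
                }
                if (event.getEvent() == ZMQ.EVENT_DISCONNECTED) {
                    System.out.println("Connection considered lost: " + event.getAddress());
                } else if (event.getEvent() == ZMQ.EVENT_CONNECT_RETRIED) {
                    System.out.println("Reconnect attempt: " + event.getAddress());
                } else if (event.getEvent() == ZMQ.EVENT_CONNECTED) {
                    System.out.println("Connected: " + event.getAddress());
                }
            }
        }
    }
}
```

In a real app the monitor loop would run on its own thread, as in the question, so that the REQ socket can keep sending requests.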

Upvotes: 1

bazza

Reputation: 8414

AFAIK there is no built-in heartbeat within ZeroMQ. I know there was some discussion on the topic within the ZMQ community some years ago, and that discussion may still be going on.

It is comparatively simple to incorporate your own heartbeat messaging in your application's use of ZeroMQ, especially if you use something like Google Protocol Buffers to encode different message types; the heartbeat is just another message.

Doing heartbeats in your application (rather than relying on some inbuilt mechanism) is ultimately more flexible; you can choose the heartbeat rate, you can choose what to do if the heartbeat fails, you can decide when heartbeating is important and not important, etc.
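
To make that concrete, here is a minimal sketch of the bookkeeping side of an application-level heartbeat, using only the Java standard library. The class name HeartbeatTracker and the 3000 ms timeout are hypothetical choices; the actual sending and receiving of heartbeat messages over ZeroMQ is left out, and the current time is passed in explicitly so the logic is easy to test.

```java
// Tracks when the peer was last heard from and decides whether it is
// still considered alive. Names and values are illustrative assumptions.
final class HeartbeatTracker {
    private final long timeoutMillis;
    private long lastSeenMillis;

    HeartbeatTracker(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastSeenMillis = nowMillis;
    }

    // Call this for every message received from the peer; a heartbeat
    // is just another message, so ordinary traffic counts as well.
    void onMessage(long nowMillis) {
        lastSeenMillis = nowMillis;
    }

    // The peer counts as alive while the silence has not exceeded the timeout.
    boolean isPeerAlive(long nowMillis) {
        return nowMillis - lastSeenMillis <= timeoutMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        HeartbeatTracker tracker = new HeartbeatTracker(3000, System.currentTimeMillis());
        Thread.sleep(100); // simulate a short period of silence
        System.out.println("alive: " + tracker.isPeerAlive(System.currentTimeMillis()));
    }
}
```

The application would call onMessage() whenever anything arrives from the peer and poll isPeerAlive() on a timer; what to do when it returns false (reconnect, alert the user, ignore) stays entirely up to the application.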

Consider heartbeats within a PUB/SUB pattern; it's a bit difficult for the ZMQ authors to decide on your behalf what connection / disconnection / connection-break events matter to you. And if they did build in a mechanism that an application developer didn't want, it would be a waste of bandwidth.

It's far easier for the ZMQ authors to leave that kind of application architectural issue to the application author (that's you!) to deal with.

With your specific example, an unplugged network cable simply looks (so far as any software can determine) like no traffic is flowing; it's the same as the application not sending anything. ZMQ doesn't send anything if the application hasn't sent anything.

If you look at the events that the socket monitor can report on, they're all the consequence of something flowing over the network connection, or something done to the socket by the application.

Upvotes: 2