Chumak disconnects all clients when one misbehaves

I have several ZeroMQ REQ clients that use the zmq Python module. I have a single server, written in Erlang, that uses chumak to implement REP for these clients.

When one client has a network issue that results in a double req (e.g. the Wi-Fi goes down for a second), the chumak library crashes and the server has to restart (this happens automatically). Every client sees a momentary hiccup when this happens (each prints a 'disconnected; reconnected' message).

The concern is that chumak crashes when a single client misbehaves (e.g. sends a double req). This is at best annoying, and at worst a DoS condition. Since the error is in the chumak library (pasted below), I wonder if it is unavoidable, or if there is something I can do to mitigate this issue?

=ERROR REPORT==== 6-Jan-2018::18:43:37 ===
** Generic server <0.6905.501> terminating 
** Last message in was {queue_ready,"node3",<0.10180.501>}
** When Server state == {state,chumak_router,
                            {chumak_router,[],
                                {lbs,
                                    #{"node1" => [<0.26044.506>],
                                      "node2" => [<0.21607.505>],
                                      "node3" => [<0.10180.501>],
                                      "node4" => [<0.6607.501>],
                                      "node5" => [<0.6654.501>]},
                                    #{<0.6607.501> => "node4",
                                      <0.6654.501> => "node5",
                                      <0.10180.501> => "node3",
                                      <0.21607.505> => "node2",
                                      <0.26044.506> => "node1"}},
                                {from,
                                    {<0.6901.501>,
                                     #Ref<0.3489020567.326893569.74040>}},
                                {[],[]}},
                            #{}}
** Reason for termination == 
** {{noproc,{gen_server,call,[<0.10180.501>,incomming_queue_out]}},
    [{gen_server,call,2,[{file,"gen_server.erl"},{line,206}]},
     {chumak_router,recv_message,2,
                    [{file,"/opt/system/system_server/src/_build/default/lib/chumak/src/chumak_router.erl"},
                     {line,101}]},
     {chumak_router,queue_ready,3,
                    [{file,"/opt/system/system_server/src/_build/default/lib/chumak/src/chumak_router.erl"},
                     {line,91}]},
     {chumak_socket,queue_ready,3,
                    [{file,"/opt/system/system_server/src/_build/default/lib/chumak/src/chumak_socket.erl"},
                     {line,221}]},
     {gen_server,try_dispatch,4,[{file,"gen_server.erl"},{line,616}]},
     {gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,686}]},
     {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,247}]}]}

Upvotes: 1

Views: 81

Answers (1)

Tianpo Gao

Reputation: 71

I think this is a bug in Chumak. The chumak_peer and chumak_socket processes run concurrently, so if chumak_peer crashes before chumak_socket makes its gen_server:call(PeerPid, incomming_queue_out), the call exits with noproc and chumak_socket crashes as well, which is what the error report in the question shows.

I have already submitted a patch to the author of Chumak. Please see it here.
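
To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the patch and not Chumak code: the module name noproc_demo, the helper safe_call/2 and the peer_down atom are hypothetical names made up for this example. It only shows that gen_server:call/2 on a dead pid exits with noproc, and that catching that exit keeps the caller alive.

```erlang
-module(noproc_demo).
-export([run/0, safe_call/2]).

%% Hypothetical helper: wrap gen_server:call/2 so that a callee that is
%% already dead yields an error tuple instead of an exit that would take
%% the calling process down with it.
%% Note: this only covers the noproc case; a callee that dies *during*
%% the call produces a different exit reason and is not caught here.
safe_call(Pid, Request) ->
    try
        gen_server:call(Pid, Request)
    catch
        exit:{noproc, _} ->
            {error, peer_down}
    end.

run() ->
    %% Stand-in for a peer process that has already terminated.
    Pid = spawn(fun() -> ok end),
    Ref = erlang:monitor(process, Pid),
    receive
        {'DOWN', Ref, process, Pid, _} -> ok
    end,
    %% An unprotected gen_server:call(Pid, incomming_queue_out) would exit
    %% here with {noproc, {gen_server, call, [...]}}, as in the error report.
    {error, peer_down} = safe_call(Pid, incomming_queue_out),
    ok.
```

Calling noproc_demo:run() from the shell should return ok. Whether returning an error tuple is the right recovery for the socket process (as opposed to, say, dropping the stale peer) depends on Chumak's internals, so treat this as an illustration of the race rather than a fix.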

Upvotes: 1
