Hyperboreus

Reputation: 32429

Erlang process dies after disconnect

I have the following setup:

When I start Alice (alice:start/0) on [email protected] it spawns a linked Bob (bob:start/1) on gca.local. Both processes trap exits.

When Alice dies of something, Bob gets notified and keeps on running. When Bob dies of something, Alice gets notified and keeps on running.

When I cut the network connection, Alice gets notified that Bob has died of noconnection and process bob dies on [email protected].

I do not want this to happen. I want Bob to keep running even though it loses its connection to Alice.

My questions are:


Here goes the code:

-module (alice).
-compile (export_all).

start () ->
    register (alice, spawn (fun init/0) ).

stop () ->
    whereis (alice) ! stop.

init () ->
    process_flag (trap_exit, true),
    Bob = spawn_link ('[email protected]', bob, start, [self () ] ),
    loop (Bob).

loop (Bob) ->
    receive
        stop -> ok;
        {'EXIT', Bob, Reason} ->
            io:format ("Bob died of ~p.~n", [Reason] ),
            loop (Bob);
        Msg ->
            io:format ("Alice received ~p.~n", [Msg] ),
            loop (Bob)
    end.

-module (bob).
-compile (export_all).

start (Alice) ->
    process_flag (trap_exit, true),
    register (bob, self () ),
    loop (Alice).

loop (Alice) ->
    receive
        stop -> ok;
        {'EXIT', Alice, Reason} ->
            io:format ("Alice died of ~p.~n", [Reason] ),
            loop (Alice);
        Msg ->
            io:format ("Bob received ~p.~n", [Msg] ),
            loop (Alice)
    after 5000 ->
        Alice ! "Hi, this Bob",
        loop (Alice)
    end.

Upvotes: 0

Views: 365

Answers (2)

hdima

Reputation: 3637

The problem is the io:format/2 call on line 13 of bob.erl. When the new process is created by spawn_link('[email protected]', ...), it inherits the group leader of the alice process, which is a process local to [email protected], so all output from bob appears on the [email protected] terminal. When [email protected] is disconnected, bob handles the EXIT message on line 12 of bob.erl, but the io:format/2 call on line 13 fails because the group leader is no longer reachable, and that failure kills bob.

The quick fix is to change all of bob's io:format/2 calls to io:format(user, Format, Data). With that change, all of bob's output is displayed on the [email protected] terminal.
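A minimal sketch of that quick fix, assuming bob.erl is otherwise unchanged. The module name bob_user_io is hypothetical and only used here to keep the example separate from the original bob module; the only substantive change is the extra user argument, which sends output to the I/O server registered on bob's own node instead of the inherited group leader.

```erlang
-module(bob_user_io).
-export([start/1]).

%% Hypothetical variant of bob:start/1 from the question.
start(Alice) ->
    process_flag(trap_exit, true),
    register(bob, self()),
    loop(Alice).

loop(Alice) ->
    receive
        stop -> ok;
        {'EXIT', Alice, Reason} ->
            %% 'user' is looked up on the local node, so this call does not
            %% depend on the (possibly unreachable) remote group leader.
            io:format(user, "Alice died of ~p.~n", [Reason]),
            loop(Alice);
        Msg ->
            io:format(user, "Bob received ~p.~n", [Msg]),
            loop(Alice)
    after 5000 ->
        Alice ! "Hi, this is Bob",
        loop(Alice)
    end.
```

With this variant, the 'EXIT' clause should no longer crash after a disconnect, because nothing in it talks to the remote node.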

However, in real projects you really should use the gen_server behaviour because it handles many edge cases, especially for inter-node communication (don't forget to look at its code). Moreover, you should use monitor/2 and/or monitor_node/2 here instead of link and trap_exit.
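As a rough illustration of the monitor/2 suggestion, here is a sketch of bob that monitors Alice instead of being linked to her. The module name bob_monitor is hypothetical, and the periodic greeting from the original loop is left out to keep the example short. It assumes Alice starts it with spawn/4 rather than spawn_link/4, so no link exists and trap_exit is not needed.

```erlang
-module(bob_monitor).
-export([start/1]).

%% Hypothetical variant of bob:start/1 that uses a monitor, not a link.
start(Alice) ->
    register(bob, self()),
    Ref = erlang:monitor(process, Alice),
    loop(Alice, Ref).

loop(Alice, Ref) ->
    receive
        stop -> ok;
        {'DOWN', Ref, process, Alice, Reason} ->
            %% Reason is noconnection when Alice's node becomes unreachable.
            %% Output goes to the local 'user' process, not the remote
            %% group leader.
            io:format(user, "Alice is down: ~p.~n", [Reason]),
            loop(Alice, Ref);
        Msg ->
            io:format(user, "Bob received ~p.~n", [Msg]),
            loop(Alice, Ref)
    end.
```

A monitor is one-directional and fires only once, so after the 'DOWN' message bob simply keeps running; Alice would set up her own monitor on Bob in the same way if she wants to be notified too.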

Upvotes: 2

Dustin

Reputation: 90960

Whenever I see a trap_exit in code, I assume someone's reinventing some part of OTP. That seems to be the case here.

Take a look at the distributed applications documentation. It does what you want with configuration alone.
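For orientation, a kernel configuration for a distributed application looks roughly like the fragment below. This is only a sketch in the style of the OTP documentation: the application name myapp and the node names are hypothetical placeholders, not taken from the question, and the timeouts are arbitrary.

```erlang
%% sys.config style fragment; myapp and the node names are placeholders.
[{kernel,
  [%% myapp runs on 'n1@host1'; if that node goes down, it is restarted
   %% on 'n2@host2' after waiting 5000 ms.
   {distributed, [{myapp, 5000, ['n1@host1', 'n2@host2']}]},
   %% Nodes that must be up before the distributed applications start.
   {sync_nodes_mandatory, ['n2@host2']},
   {sync_nodes_timeout, 30000}
  ]}
].
```

Each node gets its own file with sync_nodes_mandatory listing the other nodes, and is started with the -config flag pointing at that file.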

I've used it with pretty good amounts of success for about 7 years now (currently between an atom box and an arm5 box).

Upvotes: -1
