billcyz

Reputation: 1389

Can't start process when starting Erlang node

I am trying to start a supervisor and a gen_server process at the moment a node is created from the command line. I use the following command to start the node:

erl -name [email protected] -s ets_sup start [email protected] calc

However, when I use whereis to check for the process on the newly created node, it returns undefined. I have no problems starting ets_sup and ets_srv directly from the node's shell, but starting them from the command line doesn't work. Why is this happening?

ets_sup.erl:

-module(ets_sup).
-behaviour(supervisor).
-export([start/1, start_link/1, init/1]).

start([A, B]) ->
        start_link([A, B]).


start_link([A, B]) ->
        supervisor:start_link({local, ?MODULE}, ?MODULE, [A, B]).

init([A, B]) ->
        {ok, {{one_for_all, 0, 1},
                [{ets_srv, {ets_srv, start_link, [A, B]}, permanent, 5000, worker, [ets_srv]}]}}.

ets_srv.erl:

-module(ets_srv).
-behaviour(gen_server).
-compile(export_all).

-record(state, {a, b}).

start_link(A,B) ->
        gen_server:start_link({local, ?MODULE}, ?MODULE, [A, B], []).

init([A, B]) ->
        {ok, #state{a = A, b = B}}.

check_sys() ->
        gen_server:call(?MODULE, check_sys).

handle_call(check_sys, _From, #state{a = A, b = B} = State) ->
        {reply, {A, B}, State}.

handle_info(_Info, State) -> {noreply, State}.

handle_cast(_Req, State) -> {noreply, State}.

code_change(_Ol, State, _Ex) -> {ok, State}.

terminate(_R, _S) -> ok.

Upvotes: 1

Views: 68

Answers (1)

Pascal

Reputation: 14042

I think the function ets_sup:start/1 really is called with your parameter list. You can verify this by adding an io:format(...) call before the call to ets_sup:start_link/1.

However, the process that executes ets_sup:start/1 is an early, short-lived one that dies very soon with reason shutdown. Because your supervisor is linked to it, the supervisor dies too, along with all its children.

You have to call this function from a process that will not die (usually, this is the role of the application manager). For example:

start([A, B]) ->
    % spawn a new process
    spawn(fun () -> 
               start_link([A, B]),
               % add a loop to keep it alive
               loop()
               end).


loop() ->
    receive
        stop -> ok;
        _ -> loop()
    end.
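The effect of the link can be reproduced in isolation. The following is only a minimal sketch: the module name link_demo and the function run/0 are made up for illustration, and the supervisor has no children. It starts a supervisor from a process that exits immediately and then watches the supervisor with a monitor.

```erlang
-module(link_demo).
-behaviour(supervisor).
-export([run/0, init/1]).

%% Supervisor callback with no children; enough to show the link behaviour.
init([]) ->
    {ok, {{one_for_one, 0, 1}, []}}.

%% Hypothetical helper: start the supervisor from a short-lived process
%% and report whether the supervisor goes down with it.
run() ->
    Caller = self(),
    spawn(fun() ->
        {ok, SupPid} = supervisor:start_link(?MODULE, []),
        Caller ! {started, SupPid}
        %% The fun returns here, so the starter process exits.
    end),
    Sup = receive {started, Pid} -> Pid end,
    Ref = monitor(process, Sup),
    receive
        {'DOWN', Ref, process, Sup, Reason} -> {supervisor_died, Reason}
    after 1000 ->
        supervisor_alive
    end.
```

Calling link_demo:run() from the shell should return a {supervisor_died, Reason} tuple, because a supervisor traps exits and terminates when the process that started it exits. The exact Reason depends on timing, so I would not rely on it.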

Edit, but not an answer

I have modified your code:

  • added process_flag(trap_exit, true) to the server's init/1, in order to catch exit messages,
  • added io:format("server terminate with reason ~p, process ~p~n",[_R,self()]) to the server's terminate/2, in order to print any exit reason sent by the supervisor (note: if an exit message is sent by another process, handle_info will be triggered instead),
  • added a call to ets_srv:check_sys() in the supervisor module, just after the server is started, in order to check that it did start correctly.

Here is the modified code.

-module(ets_sup).
-behaviour(supervisor).
-export([start/1, start_link/1, init/1]).

start([A, B]) ->
    start_link([A, B]).


start_link([A, B]) ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, [A, B]),
    ets_srv:check_sys().

init([A, B]) ->
    {ok, {{one_for_all, 0, 1},
            [{ets_srv, {ets_srv, start_link, [A, B]}, permanent, 5000, worker, [ets_srv]}]}}.

-module(ets_srv).
-behaviour(gen_server).
-compile(export_all).

-record(state, {a, b}).

start_link(A,B) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [A, B], []).

init([A, B]) ->
    process_flag(trap_exit, true),
    {ok, #state{a = A, b = B}}.

check_sys() ->
    gen_server:call(?MODULE, check_sys).

handle_call(check_sys, _From, #state{a = A, b = B} = State) ->
    io:format("check_sys state ~p, process ~p~n",[State,self()]),
    {reply, {A, B}, State}.

handle_info(_Info, State) ->
    {noreply, State}.

handle_cast(_Req, State) ->
    {noreply, State}.

code_change(_Ol, State, _Ex) ->
    {ok, State}.

terminate(_R, _S) ->
    io:format("server terminate with reason ~p, process ~p~n",[_R,self()]),
    ok.

Running this version shows that the supervisor starts the server correctly and then sends it a shutdown message. This does not occur when the supervisor is started from the shell.

C:\src>erl -s ets_sup start [email protected] calc
check_sys state {state,'[email protected]',calc}, process <0.56.0>
server terminate with reason shutdown, process <0.56.0>
Eshell V8.2  (abort with ^G)
1> whereis(ets_srv).
undefined
2> ets_sup:start(['[email protected]',calc]).
check_sys state {state,'[email protected]',calc}, process <0.61.0>
{'[email protected]',calc}
3> whereis(ets_srv).
<0.61.0>
4> ets_srv:check_sys().
check_sys state {state,'[email protected]',calc}, process <0.61.0>
{'[email protected]',calc}
5> exit(whereis(ets_srv),shutdown).
true
6> whereis(ets_srv).
<0.61.0>
7> exit(whereis(ets_srv),kill).
** exception exit: shutdown
8> whereis(ets_srv).
undefined
9>

I have verified that if you start an ordinary process (not a supervisor) the same way, using spawn_link, it does not appear to receive any exit message and stays alive:

-module (st).

-compile([export_all]).

start(Arg) ->
    do_start(Arg).

do_start(Arg) ->
    io:format("spawn from ~p~n",[self()]),
    register(?MODULE,spawn_link(fun () -> init(Arg) end)).

init(Arg) ->
    io:format("init with ~p in ~p~n",[Arg,self()]),
    process_flag(trap_exit, true),
    Pid = self(),
    spawn(fun() -> monitor(process,Pid), receive M -> io:format("loop received ~p~n",[M]) end end),
    loop(Arg).

loop(Arg) ->
    receive
        state ->
            io:format("state is ~p~n",[Arg]),
            loop(Arg);
        stop ->
            io:format("stopping~n");
        _ ->
            loop(Arg)
    end.

The execution gives:

C:\src>erl -s st start [email protected] calc
spawn from <0.3.0>
init with ['[email protected]',calc] in <0.55.0>
Eshell V8.2  (abort with ^G)
1> whereis(st).
<0.55.0>
2> exit(whereis(st),shutdown).
true
3> whereis(st).
<0.55.0>
4> st ! state.
state is ['[email protected]',calc]
state
5> st ! stop.
stopping
loop received {'DOWN',#Ref<0.0.4.66>,process,<0.55.0>,normal}
stop
6> whereis(st).
undefined
7>

Edit, another way to "detach" the supervisor

All the tests I have done show that the supervisor receives a shutdown message. I don't know why; I usually use the application mechanism to start a supervision tree, and I have never run into this situation.

I suggest unlinking the supervisor from its parent, so that it will not receive the shutdown message:

-module(ets_sup).
-behaviour(supervisor).
-export([start/1, start_link/1, init/1]).

start([A, B]) ->
    start_link([A, B]).


start_link([A, B]) ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, [A, B]),
    ets_srv:check_sys().

init([A, B]) ->
    {links,[Parent]} = process_info(self(),links),
    unlink(Parent),
    {ok, {{one_for_all, 0, 1},
            [{ets_srv, {ets_srv, start_link, [A, B]}, permanent, 5000, worker, [ets_srv]}]}}.

And now it works:

C:\src>erl -s ets_sup start [email protected] calc
check_sys state {state,'[email protected]',calc}, process <0.56.0>
Eshell V8.2  (abort with ^G)
1> ets_srv:check_sys().
check_sys state {state,'[email protected]',calc}, process <0.56.0>
{'[email protected]',calc}
2>
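The pattern match {links,[Parent]} above assumes the supervisor has exactly one link at that point. A possible variant, sketched here under my own assumptions, is to drop the link from the caller's side once supervisor:start_link/3 has returned. The module name detach_demo and the helper run/0 are hypothetical, and the child list is left empty so the module compiles and runs on its own.

```erlang
-module(detach_demo).
-behaviour(supervisor).
-export([start/0, run/0, init/1]).

%% Start the supervisor, then remove the link from the caller's side,
%% so the caller's later exit is not propagated to the supervisor.
start() ->
    {ok, Pid} = supervisor:start_link({local, ?MODULE}, ?MODULE, []),
    true = unlink(Pid),
    {ok, Pid}.

%% No children here; a real tree would list its child specs instead.
init([]) ->
    {ok, {{one_for_one, 0, 1}, []}}.

%% Hypothetical check: call start/0 from a process that exits right away,
%% wait for that process to die, then see if the supervisor is still there.
run() ->
    {Starter, Ref} = spawn_monitor(fun() -> {ok, _} = start() end),
    receive {'DOWN', Ref, process, Starter, _} -> ok end,
    timer:sleep(100),
    is_pid(whereis(?MODULE)).
```

With the unlink in place, detach_demo:run() should return true, since the supervisor is still registered after its starter has exited. This is only a workaround; an unlinked supervisor is no longer part of any supervision tree, so starting it through an application remains the usual approach.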

Upvotes: 2
