dina

Reputation: 4289

How can I know when it's the last cycle of my process restarted by the supervisor in Erlang?

I have a simple_one_for_one supervisor which has gen_fsm children. I want each gen_fsm child to send a message only on the last time it terminates. Is there any way to know which cycle is the last one?

Here's my supervisor:

-module(data_sup).

-behaviour(supervisor).

%% API
-export([start_link/0,create_bot/3]).

%% Supervisor callbacks
-export([init/1]).

%%-compile(export_all).


%%%===================================================================
%%% API functions
%%%===================================================================

start_link() ->
  supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
  RestartStrategy = {simple_one_for_one, 0, 1},
  ChildSpec = {cs_fsm, {cs_fsm, start_link, []},
               permanent, 2000, worker, [cs_fsm]},
  Children = [ChildSpec],
  {ok, {RestartStrategy, Children}}.

create_bot(BotId, CNPJ, Pid) ->
  supervisor:start_child(?MODULE, [BotId, CNPJ, Pid]).

Pid is the pid of the process which starts the supervisor and gives the orders to start the children.

-module(cs_fsm).

-behaviour(gen_fsm).
-compile(export_all).

-define(SERVER, ?MODULE).
-define(TIMEOUT, 5000).

-record(params, {botId, cnpj, executionId, pid}).

%%%===================================================================
%%% API
%%%===================================================================

start_link(BotId, CNPJ, Pid) ->
  io:format("start_link...~n"),
  Params = #params{botId = BotId, cnpj = CNPJ, pid = Pid},
  gen_fsm:start_link(?MODULE, Params, []).


%%%===================================================================
%%% gen_fsm callbacks
%%%===================================================================

init(Params) ->
  io:format("initializing~n"),
  process_flag(trap_exit, true),
  {ok, requesting_execution, Params, 0}.

requesting_execution(timeout, Params) ->
  io:format("requesting execution~n"),
  {next_state, finished, Params, ?TIMEOUT}.

finished(timeout, Params) ->
  io:format("finished :)~n"),
  {stop, normal, Params}.

terminate(shutdown, _StateName, Params) ->
  Params#params.pid ! {terminated, self(), Params},
  ok;

terminate(_Reason, _StateName, _Params) ->
  ok.

My point is that if the process fails in any of the states, it should send a message only if it is the last time it is restarted by the supervisor (according to its restart strategy).

If the gen_fsm fails, does it restart from the same state with the same state data? If not, how can I make that happen?

Upvotes: 2

Views: 270

Answers (1)

Greg

Reputation: 8340

You can send the message from the Module:terminate/3 function, which is called when one of the StateName functions returns {stop,Reason,NewStateData} to indicate that the gen_fsm should be stopped.

gen_fsm is a finite state machine so you decide how it transitions between states. Something that triggers the last cycle may also set something in the StateData that is passed to Module:StateName/3 so that the function that handles the state knows it's the last cycle. It's hard to give a more specific answer unless you provide some code which we could analyze and comment on.

EDIT after further clarification:

A supervisor doesn't tell its children how many times it has restarted them, and it also can't tell a child that this is the last restart. The latter is simply because it doesn't know that a restart is going to be the last one until the child process actually crashes once more, which the supervisor can't possibly predict. Only after the child has crashed can the supervisor calculate how many times the child crashed during the period of time, and decide whether it is allowed to restart the child once more or whether that was the last restart and it is now time for the supervisor to die as well.

However, nothing is stopping the child from recording, e.g. in an ETS table, how many times it has been restarted. But of course that won't help with deducing which restart is the last one.
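As a rough illustration of the ETS idea, here is a minimal sketch. The module name restart_counter, the table name restart_counts and the functions new/0 and bump/1 are all hypothetical and not part of the question's code. It assumes the table is created by a process that outlives both the children and the supervisor (for example the controlling process), because an ETS table is destroyed when its owner dies.

```erlang
-module(restart_counter).
-export([new/0, bump/1]).

%% Hypothetical helper: create the counter table once, from a long-lived
%% process. The table is public so that any child may update it.
new() ->
    ets:new(restart_counts, [named_table, public, set]).

%% Hypothetical helper: call this from the child's init/1.
%% Inserts {BotId, 0} if the key is missing, then increments the counter
%% and returns the number of times this BotId has been started so far.
bump(BotId) ->
    ets:update_counter(restart_counts, BotId, 1, {BotId, 0}).
```

In cs_fsm this could be called as restart_counter:bump(Params#params.botId) at the top of init/1. The result tells the child how many times it has started, not how many starts remain.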

Edit 2:

When the supervisor restarts the child it starts it from scratch using the standard init function. Any previous state of the child before it crashed is lost.

Please note that a crash is an exceptional situation and it's not always possible to recover the state, because the crash could have corrupted it. Instead of trying to recover the state or asking the supervisor when it's done restarting the child, why not prevent the crash from happening in the first place? You have two options:

I. Use try/catch to catch any exceptional situations and act accordingly. It's possible to catch any error that would otherwise crash the process and cause supervisor to restart it. You can add try/catch to any entry function inside the gen_fsm process so that any error condition is caught before it crashes the server. See example function 1 or example function 2:

read() ->
    try
        try_home() orelse try_path(?MAIN_CFG) orelse
            begin io:format("Some Error", []) end
    catch
        throw:Term -> {error, Term}
    end.

try_read(Path) ->
    try
        file:consult(Path)
    catch
        error:Error -> {error, Error}
    end.
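Applied to a gen_fsm state function like the one in the question, option I might look roughly like the sketch below. Everything in it is an assumption made for illustration: the module name cs_fsm_safe, the helper safe_run/1, the {failed, Pid, Why} message, and the use of a {Work, NotifyPid} tuple as state data instead of the #params record.

```erlang
-module(cs_fsm_safe).
-export([safe_run/1, requesting_execution/2]).

%% Hypothetical helper: run Fun and turn any exception (throw, error or
%% exit) into a tagged tuple instead of letting it crash the process.
safe_run(Fun) ->
    try
        {ok, Fun()}
    catch
        Class:Reason -> {error, {Class, Reason}}
    end.

%% State callback in the style of cs_fsm. Work is a fun that does the
%% actual job, NotifyPid is the process that wants to hear about failures.
requesting_execution(timeout, {Work, NotifyPid} = StateData) ->
    case safe_run(Work) of
        {ok, _Result} ->
            {next_state, finished, StateData, 5000};
        {error, Why} ->
            %% Report the failure once and stop without crashing.
            NotifyPid ! {failed, self(), Why},
            {stop, normal, StateData}
    end.
```

Note that with the permanent restart type used in data_sup, a child is restarted even when it stops with reason normal. For a clean stop like this one to be final, the child spec would need the transient restart type.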

II. Spawn a new process to handle the job and trap the EXIT signal when that process dies. This allows the gen_fsm to handle a job asynchronously and to handle any errors in a custom way (not necessarily by restarting the process as a supervisor would do). This section titled Error Handling explains how to trap exit signals from child processes. And this is an example of trapping signals in a gen_server. Check the handle_info function, which contains a few clauses to trap different types of EXIT messages from child processes.

init([Cfg, Id, Mode]) ->
    process_flag(trap_exit, true),
    (...)


handle_info({'EXIT', _Pid, normal}, State) ->
    {noreply, State};
handle_info({'EXIT', _Pid, noproc}, State) ->
    {noreply, State};
handle_info({'EXIT', Pid, Reason}, State) ->
    log_exit(Pid, Reason),
    check_done(error, Pid, State);
handle_info(_, State) ->
    {noreply, State}.
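The same pattern carries over to a gen_fsm, where the callback is handle_info/3 and also receives the current state name. The following is only a sketch under assumptions: the module name cs_fsm_worker, the state name working, the {job_failed, Pid, Reason} message and the {Work, NotifyPid} init argument are invented for illustration.

```erlang
-module(cs_fsm_worker).
-export([init/1, handle_info/3]).

%% Trap exits, then run the risky job in a linked worker process so that
%% a crash in the job arrives as a message instead of killing the fsm.
init({Work, NotifyPid}) ->
    process_flag(trap_exit, true),
    Worker = spawn_link(Work),
    {ok, working, {Worker, NotifyPid}}.

%% The worker finished without error: stop quietly.
handle_info({'EXIT', Worker, normal}, working, {Worker, _} = StateData) ->
    {stop, normal, StateData};
%% The worker crashed: report it once, then stop without crashing.
handle_info({'EXIT', Worker, Reason}, working, {Worker, NotifyPid} = StateData) ->
    NotifyPid ! {job_failed, self(), Reason},
    {stop, normal, StateData};
%% Anything else is ignored and the fsm stays in its current state.
handle_info(_Other, StateName, StateData) ->
    {next_state, StateName, StateData}.
```

Because the failure is handled inside the gen_fsm, the decision about what counts as the final attempt stays in your own code (for example a retry counter kept in the state data) rather than in the supervisor.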

Upvotes: 3
