Reputation: 23178
So I have an application with a process that requires some gen_servers to be alive elsewhere in the cluster.
If they are up it just works; if they are not, my gen_server fails in init with {error,Reason}, and this propagates through my supervisor into my application's start function.
The problem is that if I return anything other than {ok,Pid} I get a crash report.
My intention is to signal that the application couldn't start properly, that all its processes are down, and that it should therefore not be considered active. However, I can only choose between returning {ok, self()} and seeing my application listed as active when it is not, or returning {error, Error} and seeing it crash with:
{gen_server,init_it,6},{proc_lib,init_p_do_apply,3}]}},{ancestors,[rtb_sup,<0.134.0>]},
{messages,[]},{links,[<0.135.0>]},{dictionary,[]},{trap_exit,false},{status,running},
{heap_size,377},{stack_size,24},{reductions,255}],[]]:\n"
The problem seems to be bigger than this: basically there is no way to tell the application framework that the app failed. It may look like one of those things that are handled by "let the process die" in Erlang, but allowing an {error, } return value from application:start seems like a good tradeoff.
Any hints?
Upvotes: 2
Views: 284
Reputation: 4077
An application can crash at any moment, so checking an application's dependencies at start time cannot give you useful information about crashes that happen later.
I have read part of the RabbitMQ project source code; it is also a cluster-based program.
I think RabbitMQ has faced a problem similar to yours, because the cluster needs to collect each related node's application "is alive" information and memory high-watermark information, and then make decisions.
Its solution is to register the first main process of the application locally on the node. In RabbitMQ the name is "rabbit"; you can find this in the rabbit.erl file, in the function "start/2".
    start(normal, []) ->
        case erts_version_check() of
            ok ->
                {ok, SupPid} = rabbit_sup:start_link(),
                true = register(rabbit, self()),
                print_banner(),
                [ok = run_boot_step(Step) || Step <- boot_steps()],
                io:format("~nbroker running~n"),
                {ok, SupPid};
            Error ->
                Error
        end.
Four other modules, rabbit_node_monitor.erl, rabbit_memory_monitor.erl, vm_memory_monitor.erl and rabbit_alarm.erl, use two Erlang techniques: one is monitoring a process to get the "DOWN" message of the registered process, the other is an alarm handler to collect this information.
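As a rough illustration of the monitoring technique (this is not RabbitMQ's code; the module name app_watch and the function watch/2 are made up for this sketch), a process can monitor a name registered on another node and wait for the "DOWN" message:

```erlang
-module(app_watch).
-export([watch/2]).

%% Hypothetical helper: monitor the process registered as Name on Node
%% and block until it goes down. If the name is not registered, or the
%% node is unreachable, the 'DOWN' message arrives immediately with
%% reason noproc or noconnection.
watch(Name, Node) ->
    Ref = erlang:monitor(process, {Name, Node}),
    receive
        {'DOWN', Ref, process, {Name, Node}, Reason} ->
            {down, Node, Reason}
    end.
```

For example, app_watch:watch(rabbit, SomeNode) would return once the process registered as rabbit on SomeNode dies, so the caller can treat that application as no longer alive. In a real system you would probably handle the message inside a gen_server's handle_info/2 instead of blocking in a receive.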
Upvotes: 1