Reputation: 1626
I wrote a new library called director.
It's a supervisor library.
One of its features is that you can give director a fun of arity 2, and director will call that fun on every crash of the process: the first argument is the crash reason and the second is the crash count. For example:
-module(director_test).
-behaviour(director).

-export([start_link/0, init/1]).

start_link() ->
    director:start_link(?MODULE, []).

init([]) ->
    ChildSpec = #{id => foo,
                  start => {m, f, args},
                  plan => [fun my_plan/2],
                  count => infinity},
    {ok, [ChildSpec]}.

my_plan(normal, Count) when Count rem 10 == 0 ->
    %% On every 10th crash with reason normal, director will restart
    %% the process after waiting 3000 milliseconds.
    {restart, 3000};
my_plan(normal, _Count) ->
    %% On any other crash with reason normal, director will restart it.
    restart;
my_plan(killed, _Count) ->
    %% If the process was killed, director will delete it from its children.
    delete;
my_plan(Reason, _Count) ->
    %% For other reasons, director will crash with reason {foo_crashed, Reason}.
    {stop, {foo_crashed, Reason}}.
I announced my library on Slack, and people there were wondering about writing a new supervisor in this way!
Someone said, "I tend to not let the supervisor handle back-off".
In the end they did not give me clear information, and I think I need to know more about supervisors and their duties.
I think that a supervisor is a process that should know when to restart a child, when to delete it, and when not to restart it. Am I right?
Can you tell me some good features of OTP/Supervisor that I do not have in Director? (List of director's features)
Upvotes: 2
Views: 242
Reputation: 13164
You are mixing the ideas of supervision and management.
Supervision is already a part of OTP. It is the basic idea that:
Management is something that may or may not be present in your system, so it is left up to you. It is the idea that you would have a single (usually named) process that guides the overall high-level task that your (supervised) workers are doing. Having a manager process gives you a single point of control for the overall effort being done -- which also means it is a single place you can tell that overall effort to start, stop, suspend itself, etc. and this is where you could add additional logic about selective restarts based on some crash condition.
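To make the idea of a manager concrete, here is a minimal sketch, not a definitive design. The module name job_manager, the StartWorker fun, the crash_count/0 helper and the decide/2 policy are all assumptions made for illustration; the policy simply mirrors the plan from the question. The manager monitors one worker and decides what to do when it goes down, so a plain supervisor (for example one whose children are temporary) never needs to know about crash reasons.

```erlang
-module(job_manager).
-behaviour(gen_server).

-export([start_link/1, crash_count/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% StartWorker is a hypothetical zero-arity fun that returns {ok, Pid},
%% e.g. fun() -> supervisor:start_child(worker_sup, []) end.
start_link(StartWorker) when is_function(StartWorker, 0) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, StartWorker, []).

%% Number of worker exits the manager has seen so far.
crash_count() ->
    gen_server:call(?MODULE, crashes).

init(StartWorker) ->
    self() ! start_worker,
    {ok, #{start => StartWorker, crashes => 0, ref => undefined}}.

handle_call(crashes, _From, State = #{crashes := N}) ->
    {reply, N, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(start_worker, State = #{start := StartWorker}) ->
    {ok, Pid} = StartWorker(),
    {noreply, State#{ref := erlang:monitor(process, Pid)}};
handle_info({'DOWN', Ref, process, _Pid, Reason},
            State = #{ref := Ref, crashes := N}) ->
    Count = N + 1,
    NewState = State#{crashes := Count, ref := undefined},
    case decide(Reason, Count) of
        restart ->
            self() ! start_worker,
            {noreply, NewState};
        {restart, Delay} ->
            %% Back-off lives here, not in the supervisor.
            erlang:send_after(Delay, self(), start_worker),
            {noreply, NewState};
        give_up ->
            {noreply, NewState};
        {stop, StopReason} ->
            {stop, StopReason, NewState}
    end;
handle_info(_Other, State) ->
    {noreply, State}.

%% Illustrative policy only; a real system would define its own.
decide(normal, Count) when Count rem 10 =:= 0 -> {restart, 3000};
decide(normal, _Count) -> restart;
decide(killed, _Count) -> give_up;
decide(Reason, _Count) -> {stop, {worker_crashed, Reason}}.
```

The manager itself would normally sit under an ordinary supervisor as well, so the selective-restart logic stays in one named process that can be inspected, suspended or replaced.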
Think of "supervision" as a low-level, system framework type idea. It is always the same in all programs just like opening a file or handling a network socket would be. Think of management as one discrete chunk of the actual problem your program needs to solve to accomplish its work.
Management may or may not be complex. Supervision must always be uniform and simple. Giving supervisors too much responsibility makes them difficult to understand and debug, and often leads to business problems -- an overloaded supervisor can be a major problem in a system. Don't burden your supervisors with high-level management tasks.
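For comparison, this is roughly what a uniform, simple OTP supervisor looks like. It is a sketch under assumptions: the module name plain_sup, the restart limits, and the stand-in worker defined in the same module are invented for the example. The point is that the child spec says only how to start the child and which fixed restart rule applies; nothing in it depends on why the child died.

```erlang
-module(plain_sup).
-behaviour(supervisor).

-export([start_link/0, init/1, start_worker/0]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    SupFlags = #{strategy => one_for_one,
                 intensity => 5,    % give up after more than 5 restarts ...
                 period => 10},     % ... within 10 seconds
    ChildSpec = #{id => worker,
                  start => {?MODULE, start_worker, []},
                  restart => permanent,   % always restart, whatever the reason
                  shutdown => 5000,
                  type => worker},
    {ok, {SupFlags, [ChildSpec]}}.

%% Stand-in worker (hypothetical): a real system would start a
%% gen_server or similar from its own module here.
start_worker() ->
    {ok, proc_lib:spawn_link(fun worker_loop/0)}.

worker_loop() ->
    receive stop -> ok end.
```

Killing the worker with exit(Pid, kill) and then calling supervisor:which_children(plain_sup) should show a new pid under the same id, with no decision logic involved.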
I wrote an article about the "service -> worker pattern" in Erlang a while back. Hopefully it informs more than it confuses: https://zxq9.com/archives/1311
Upvotes: 2
Reputation: 1833
Please do not take this personally. You asked for feedback, and I'm trying to give it to you.
After quickly looking at the docs and the code, I think the main problems with your library are:
1. You are introducing complexity in an area where it's normally not needed. In the vast majority of Erlang programs you don't want to analyse why a process has crashed. Analysing it is error-prone. So the "normal" solution is just to restart the process. If you introduce any logic at this point, you probably introduce some errors too. Such a program is harder to reason about, and the advantages are disputable at best.
2. You are assuming that the exit reason is the reason why the process itself exited. This is not necessarily true. The reason could have been propagated from one of its linked processes. If you really wanted to react to all possible exit reasons, you would have to take a transitive closure over all of the process's exit reasons, all its children's exit reasons, all their children's exit reasons, and so on. And you would have to update it whenever any of the components changes, which is very bad practice and very error-prone. The complexity introduced (see 1) explodes very badly.
3. You introduce some "introspection" logic outside the context where the internal logic should ideally be kept, i.e. knowledge about the internal workings of the process is used outside of its module, in the director's plan. This breaks encapsulation somewhat. The "normal" supervisor knows only how to start the process; it doesn't need any more information about the process internals.
4. Last but not least: you are probably solving a non-existent problem. Instead of developing a whole new solution, you should clearly identify the problems of the existing solution and try to solve them very directly and minimally.
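The point about propagated exit reasons can be seen in a few lines. This is a small demonstration with made-up names (exit_demo, helper_failed), not code from director: the worker does nothing wrong itself, yet an observer sees it exit with the reason of a helper it was linked to.

```erlang
-module(exit_demo).
-export([run/0]).

%% Returns the exit reason an observer sees for the worker process.
run() ->
    {Pid, Ref} = spawn_monitor(fun worker/0),
    receive
        {'DOWN', Ref, process, Pid, Reason} -> Reason
    after 1000 ->
        timeout
    end.

worker() ->
    %% The worker links to a helper and then simply waits.
    %% It does not trap exits, so the helper's exit signal kills it
    %% with the helper's reason.
    _Helper = spawn_link(fun() -> exit(helper_failed) end),
    receive stop -> ok end.
```

Calling exit_demo:run() returns helper_failed, even though that reason originated in the helper. A plan keyed on the worker's exit reason would therefore also have to know every reason any linked process can produce.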
Upvotes: -2