Supervisor child vs plain spawn_link

Question

I have a hierarchy of processes called "monitor_node". Each of these monitor_nodes is supervised by one supervisor.

Now, each of these nodes may have a complex inner structure: it may (or may not) have subprocesses that it needs in order to operate properly, for example a process that sends keep-alive messages. So far I have been using plain spawn_link to create these "internal" processes.

However, I have realized that spawning them in the init function of monitor_node (which is supervised) sometimes causes this function to fail (and therefore the whole supervisor tree fails). My question is: would it be a good solution to attach these internal processes to the supervisor tree? I am thinking about changing monitor_node into a supervisor that supervises its internal processes.

My doubts are:

I would have to supervise a fairly large number of very small processes. I am not sure whether this is good practice.
I may not know in advance whether a given "internal" process is a simple process or has some internal structure of its own (that is, it also spawns other processes). In the latter case I should probably attach these "inner-inner" processes to the supervisor tree as well.

I hope I have not confused you too much. Looking forward to an answer.

EDIT:

A very similar (if not the same) problem is discussed here (3rd post). The solution given is pretty much the same as the one that I GIVE CRAP ANSWERS gave.

I GIVE CRAP ANSWERS · Accepted Answer

Supervisors:

There is a trick here, which includes the use of two supervisors. Your tree goes like:

main_sup -> worker
main_sup -> attached_pool_sup

attached_pool_sup -> workers

main_sup is one_for_all, so if the worker or the pool supervisor dies, the whole tree is done for and killed off. The pool supervisor is a simple_one_for_one, which is suitable for having hundreds or thousands of workers.
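
A minimal sketch of that tree, as one way it could be wired up. The module name monitor_node_sup, the helper functions, and the stand-in worker and keep-alive processes are all assumptions made for illustration; in real code the worker would be your monitor_node process. One module serves as the callback for both supervisors by switching on the init argument.

```erlang
-module(monitor_node_sup).
-behaviour(supervisor).

-export([start_link/0, pool_sup/1, start_helper/2]).
-export([init/1, start_worker/0, start_keepalive/1]).

%% Start the one_for_all supervisor for a single monitor_node.
start_link() ->
    supervisor:start_link(?MODULE, main).

%% Look up the pool supervisor of a running tree. Do not call this from
%% the worker's own init: the parent supervisor is blocked while the
%% worker starts, so the call would deadlock.
pool_sup(MainSup) ->
    {pool_sup, Pid, supervisor, _} =
        lists:keyfind(pool_sup, 1, supervisor:which_children(MainSup)),
    Pid.

%% Attach one more "internal" process to the pool.
start_helper(PoolSup, IntervalMs) ->
    supervisor:start_child(PoolSup, [IntervalMs]).

init(main) ->
    Flags = #{strategy => one_for_all, intensity => 1, period => 5},
    Children = [
        %% Pool first, so it already exists when the worker comes up.
        #{id => pool_sup,
          start => {supervisor, start_link, [?MODULE, pool]},
          type => supervisor,
          modules => [?MODULE]},
        #{id => worker,
          start => {?MODULE, start_worker, []}}
    ],
    {ok, {Flags, Children}};
init(pool) ->
    Flags = #{strategy => simple_one_for_one, intensity => 10, period => 5},
    Child = #{id => keepalive,
              start => {?MODULE, start_keepalive, []},
              restart => transient},
    {ok, {Flags, [Child]}}.

%% Stand-in for the real monitor_node process (hypothetical).
start_worker() ->
    {ok, proc_lib:spawn_link(fun worker_loop/0)}.

worker_loop() ->
    receive
        stop -> ok;
        _ -> worker_loop()
    end.

%% Stand-in for a small internal process (hypothetical).
start_keepalive(IntervalMs) ->
    {ok, proc_lib:spawn_link(fun() -> keepalive_loop(IntervalMs) end)}.

keepalive_loop(IntervalMs) ->
    receive
        stop -> ok
    after IntervalMs ->
        %% A real helper would send its keep-alive message here.
        keepalive_loop(IntervalMs)
    end.
```

With this layout, a crash of an individual helper is handled inside the pool, while a crash of the worker or of the pool supervisor itself takes down and restarts the whole monitor_node tree.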

Init:

Don't do too much work in your init callback. The supervisor blocks until init returns, so a slow init holds up the start of every child that comes after it. You can set a timeout on the start (and increase it in your case) if init takes longer than normal.

A trick is to return quickly from init with a timeout of 0 and then handle the additional setup in handle_info when the timeout message arrives. That way you won't be holding up the main supervisor. Beware of races here! A message that reaches the process before the timeout fires is handled first, while the setup is still unfinished.
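
A minimal sketch of the timeout trick, assuming monitor_node is a gen_server. The ready/1 function, the map-based state, and the keep-alive stand-in are hypothetical and only there to make the example runnable. Because any message that arrives before the timeout cancels it, the sketch re-arms the 0 timeout until the setup has actually run.

```erlang
-module(monitor_node).
-behaviour(gen_server).

-export([start_link/0, ready/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Ask whether the deferred setup has finished (hypothetical API).
ready(Pid) ->
    gen_server:call(Pid, ready).

init([]) ->
    %% Return at once; the timeout of 0 delivers a 'timeout' message
    %% to handle_info/2 as soon as the mailbox is empty.
    {ok, #{ready => false, keepalive => undefined}, 0}.

handle_call(ready, _From, State = #{ready := true}) ->
    {reply, true, State};
handle_call(ready, _From, State) ->
    %% Setup has not run yet: answer, then re-arm the timeout.
    {reply, false, State, 0}.

handle_cast(_Msg, State) ->
    noreply(State).

handle_info(timeout, State = #{ready := false}) ->
    %% The slow or failure-prone setup goes here, outside init/1.
    Pid = spawn_link(fun keepalive_loop/0),
    {noreply, State#{ready := true, keepalive := Pid}};
handle_info(_Other, State) ->
    noreply(State).

%% Keep re-arming the 0 timeout until setup is done, because a message
%% handled before the timeout fires would otherwise cancel it.
noreply(State = #{ready := false}) -> {noreply, State, 0};
noreply(State) -> {noreply, State}.

%% Stand-in for an internal helper process (hypothetical).
keepalive_loop() ->
    receive
        stop -> ok
    after 1000 ->
        keepalive_loop()
    end.
```

Callers still have to cope with the window in which ready/1 returns false. On OTP 21 and later, returning {ok, State, {continue, Term}} from init and doing the setup in handle_continue/2 avoids that window, since the continuation runs before any other message is handled.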
