Reputation: 1339
I'm working with Elixir but I believe this question applies to Erlang as well.
I'm working on a system which might create tens of thousands of process groups of the same kind. Each group will have 2 workers and a local supervisor of its own. The question is: who will supervise the local supervisors?
I can imagine two strategies
Does either make sense, or is there another way? Any advice is welcome.
Upvotes: 3
Views: 430
Reputation: 73
Did you try to measure the performance of a supervisor with thousands of children?
I have had a supervisor with about 24,000 children and it did not even break a sweat (on old hardware).
I am pretty sure that if you try it out, a supervisor would easily handle hundreds of thousands, maybe even millions, of children without any problem.
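One rough way to take that measurement yourself is sketched below. It assumes a plain DynamicSupervisor and uses idle Agents as stand-ins for real children; the child count is arbitrary and the timing will vary by machine, so treat it as a starting point rather than a benchmark.

```elixir
# Sketch only: Agents stand in for real workers, and 24_000 is an arbitrary count.
{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)

count = 24_000

# Time how long it takes to start all children under the one supervisor.
{micros, :ok} =
  :timer.tc(fn ->
    Enum.each(1..count, fn i ->
      {:ok, _pid} = DynamicSupervisor.start_child(sup, {Agent, fn -> i end})
    end)
  end)

# count_children/1 reports how many children are currently alive.
%{active: active} = DynamicSupervisor.count_children(sup)
IO.puts("started #{active} children in #{div(micros, 1000)} ms")
```

Raising `count` and re-running gives a feel for where, if anywhere, a single supervisor starts to struggle on your hardware.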
Premature optimisation is the root of all evil.
Upvotes: 1
Reputation: 1626
Use director with ETS mode and don't worry about the number of children.
In ETS mode, you can also read some info about the children directly from the table.
Upvotes: 0
Reputation: 1805
"It depends".
"Huge list" and "thousands" really are in different realms. Simple iteration is fast on modern machines. Up to the high five or low six figures of items, I would have no qualms about a system that regularly has to traverse a list of this size, and beyond that I probably wouldn't care much either:
iex(2)> list = Enum.to_list 1..1_000_000; :timer.tc(fn -> Enum.sum list end)
{24497, 500000500000}
(that is 25 ms for the list traversal and some arithmetic - I'm usually happy if a crashed process gets restarted with such small delays)
Of course - at the end of the day you're expected to do your own performance testing, compare the outcomes with the expected local supervisor crash rate, look up your system's requirements, and compare all these figures to come to an answer.
In the meantime, use the simplest thing that can possibly work: a single global supervisor monitoring a flat hierarchy.
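A minimal sketch of that flat layout might look like the following. The module name `GroupSupervisor`, the worker ids, and the use of Agents as placeholder workers are all assumptions made for illustration; the question does not say what the real workers do.

```elixir
defmodule GroupSupervisor do
  # Hypothetical local supervisor: one per group, owning that group's two workers.
  use Supervisor

  def start_link(group_id), do: Supervisor.start_link(__MODULE__, group_id)

  @impl true
  def init(group_id) do
    children = [
      # Agents are placeholders for the real workers.
      Supervisor.child_spec({Agent, fn -> {group_id, :a} end}, id: :worker_a),
      Supervisor.child_spec({Agent, fn -> {group_id, :b} end}, id: :worker_b)
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end
end

# The single global supervisor: every group supervisor is a direct child.
{:ok, global} = DynamicSupervisor.start_link(strategy: :one_for_one)

for group_id <- 1..1_000 do
  {:ok, _pid} = DynamicSupervisor.start_child(global, {GroupSupervisor, group_id})
end

counts = DynamicSupervisor.count_children(global)
IO.inspect(counts, label: "global supervisor children")
```

If measurements later show the single supervisor is a bottleneck, the group supervisors can be moved under several intermediate supervisors without changing `GroupSupervisor` itself.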
Upvotes: 3
Reputation: 121010
Approach one is perfectly efficient. The global supervisor does not need to traverse anything as long as each subgroup has its own local supervisor and that supervisor is not expected to crash.
When something happens to a leaf worker, its local supervisor takes care of restarting it, and the global supervisor never even learns that something went wrong down in the tree.
If, OTOH, you expect your local supervisors to crash from time to time on purpose, each local supervisor should be supervised by its own, say, intermediate supervisor, which will take care of its restarts. The global supervisor will in this case manage these intermediate supervisors, and everything will be cool again.
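The claim that a leaf worker crash stays local can be checked with a small experiment. The sketch below assumes Agents as stand-in workers and hypothetical ids (`:local`, `:worker_a`, `:worker_b`); it kills one worker and then confirms that the local supervisor restarted it while the global supervisor still holds the same child.

```elixir
# Two placeholder workers for one group; ids are hypothetical.
children = [
  Supervisor.child_spec({Agent, fn -> :a end}, id: :worker_a),
  Supervisor.child_spec({Agent, fn -> :b end}, id: :worker_b)
]

{:ok, global} = DynamicSupervisor.start_link(strategy: :one_for_one)

# The local supervisor is started as a child of the global one.
local_spec = %{
  id: :local,
  start: {Supervisor, :start_link, [children, [strategy: :one_for_one]]},
  type: :supervisor
}

{:ok, local} = DynamicSupervisor.start_child(global, local_spec)

# Look up worker_a under the local supervisor and kill it.
{_, old_pid, _, _} = List.keyfind(Supervisor.which_children(local), :worker_a, 0)
Process.exit(old_pid, :kill)

# Poll (for up to about a second) until the local supervisor reports a new pid.
new_pid =
  Enum.find_value(1..100, fn _ ->
    case List.keyfind(Supervisor.which_children(local), :worker_a, 0) do
      {_, pid, _, _} when is_pid(pid) and pid != old_pid ->
        pid

      _ ->
        Process.sleep(10)
        nil
    end
  end)

IO.inspect({old_pid, new_pid}, label: "worker_a before/after restart")
IO.inspect(DynamicSupervisor.which_children(global), label: "global children")
```

The global supervisor's child list still contains only the original local supervisor pid, which is the sense in which it never notices the worker crash.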
Upvotes: 0