Is using different thread pools for different types of tasks worth the overhead?

Question

I'm designing a class that provides statistical information about groups of Collatz sequences. One of my goals is to be able to process a large number of sequences containing enormous terms (on the scale of hundreds or even thousands of digits) simultaneously, with maximum efficiency.

To this end, I plan on using the best data collection technique for each individual statistic, which means some tasks may be more efficiently dealt with by a ForkJoinPool, others by the standard cached and fixed thread pools provided in Executors. Would the overhead of creating multiple thread pools, or shutting one down and creating another, if I went that route, cost me more than I would save?

Stephen C · Accepted Answer

Would the overhead of creating multiple thread pools, or shutting one down and creating another, if I went that route, cost me more than I would save?

How could we possibly tell you that?

There is definitely an overhead in shutting down and restarting a thread pool of any kind. Creating threads is not cheap.

However, we have no way of quantifying how much you save by using different kinds of thread pool. If we can't quantify that, it is impossible to advise you on whether your strategy will work ... or not.

(But I think that repeatedly shutting down and recreating thread pools would be a bad idea. The performance impact of an idle pool is minimal.)
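To illustrate the "keep the pools around" option, here is a minimal sketch in which both pools are created once and reused for every batch of work. The class name `SharedPools`, the helper `stepsToOne`, and the one-minute shutdown timeout are assumptions made for the example, not anything from the question.

```java
import java.math.BigInteger;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

/** Hypothetical holder: both pools are created once and reused until the program ends. */
final class SharedPools {
    private static final BigInteger THREE = BigInteger.valueOf(3);

    private final ForkJoinPool forkJoin = new ForkJoinPool();
    private final ExecutorService fixed =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    ForkJoinPool forkJoin() { return forkJoin; }

    ExecutorService fixed() { return fixed; }

    /** Illustrative task: number of Collatz steps needed to reach 1 from n (n >= 1). */
    static long stepsToOne(BigInteger n) {
        long steps = 0;
        while (!n.equals(BigInteger.ONE)) {
            // Odd terms map to 3n + 1, even terms to n / 2.
            n = n.testBit(0) ? n.multiply(THREE).add(BigInteger.ONE) : n.shiftRight(1);
            steps++;
        }
        return steps;
    }

    /** Call once, when no more work will be submitted. */
    void shutdown() throws InterruptedException {
        forkJoin.shutdown();
        fixed.shutdown();
        forkJoin.awaitTermination(1, TimeUnit.MINUTES);
        fixed.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

Each statistic would then submit its tasks to whichever pool suits it, for example `pools.fixed().submit(() -> SharedPools.stepsToOne(n))`, and the pool that is not in use simply sits idle. Whether this actually beats a single pool is still something only a benchmark can tell you.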

This "smells" of premature optimization. (It is like trying to tune the engine of a racing car before you have manufactured the engine block!)

My advice would be to (largely¹) forget about performance to start with. For now, focus on getting something that works. Here's what I would do:

1. Implement the code using the easiest strategy, write test cases, test / debug until it works.
2. Choose a sample problem or set of problems that is typical of the kind you will be trying to solve.
3. Implement a test harness that allows you to measure the code's performance for the sample problems. (Beware of the standard problems with Java benchmarking ...)
4. Benchmark your code.
   - Is it fast enough? Stop NOW.
   - If not, continue.
5. Implement one of the alternative strategies, and test / debug.
6. Benchmark the modified code.
   - Is it fast enough? Stop NOW.
   - Is it clear that it doesn't help? Abandon it, and try another strategy.
   - Can you tweak it? If so, try that.
7. Go to 5.
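The "measure, then decide" part of that loop could be sketched roughly as follows. This is a deliberately crude harness, not a recommendation: the class `CrudeBench`, its method names, and the idea of comparing a median against a target time are all assumptions for illustration. It does a few unmeasured warm-up runs so the JIT has a chance to compile the code, and it consumes each result so the work is less likely to be optimized away. A dedicated tool such as JMH deals with these pitfalls far more thoroughly.

```java
import java.util.Arrays;
import java.util.function.LongSupplier;

/** Hypothetical, deliberately crude timing harness. */
final class CrudeBench {
    // Results are accumulated here so the JIT cannot discard the measured work as dead code.
    private static volatile long sink;

    /** Runs the task `warmups` times unmeasured, then returns the median of `runs` timed runs in nanoseconds. */
    static long medianNanos(LongSupplier task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            sink += task.getAsLong();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink += task.getAsLong();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    /** The "is it fast enough?" check: compare the measurement against a target you chose up front. */
    static boolean fastEnough(long medianNanos, long targetNanos) {
        return medianNanos <= targetNanos;
    }
}
```

You would pass in a task that runs one of your sample problems with the strategy under test, for instance `CrudeBench.medianNanos(() -> runSampleProblem(), 10, 30)`, where `runSampleProblem` is a placeholder for your own code. The useful comparison is between strategies measured on the same machine with the same inputs.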

Also, it may be worthwhile implementing the different strategies in such a way that you can tune them or switch between them using command line or config file settings.
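One possible way to make the strategy switchable, sketched under assumptions: the enum `PoolStrategy` and the system property names `collatz.pool` and `collatz.threads` are invented for this example. The pool type is chosen at startup from a `-D` setting, so you can benchmark each strategy without recompiling.

```java
import java.util.Locale;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;

/** Hypothetical switch between pool types; ForkJoinPool is itself an ExecutorService. */
enum PoolStrategy {
    FIXED {
        @Override ExecutorService create(int threads) { return Executors.newFixedThreadPool(threads); }
    },
    CACHED {
        // A cached pool grows on demand, so the thread count is ignored here.
        @Override ExecutorService create(int threads) { return Executors.newCachedThreadPool(); }
    },
    FORK_JOIN {
        @Override ExecutorService create(int threads) { return new ForkJoinPool(threads); }
    };

    abstract ExecutorService create(int threads);

    /** Accepts settings such as "fixed", "cached", "fork-join" or "FORK_JOIN". */
    static PoolStrategy fromSetting(String value) {
        return valueOf(value.trim().toUpperCase(Locale.ROOT).replace('-', '_'));
    }
}

final class PoolConfigDemo {
    public static void main(String[] args) throws Exception {
        // Assumed property names; run with e.g. -Dcollatz.pool=fork-join -Dcollatz.threads=8
        String setting = System.getProperty("collatz.pool", "fixed");
        int threads = Integer.getInteger("collatz.threads", Runtime.getRuntime().availableProcessors());
        ExecutorService pool = PoolStrategy.fromSetting(setting).create(threads);
        try {
            System.out.println(pool.submit(() -> 6 * 7).get());
        } finally {
            pool.shutdown();
        }
    }
}
```

An unknown setting makes `valueOf` throw `IllegalArgumentException`, which is probably what you want for a mistyped option. Reading the same values from a config file instead of system properties would work just as well.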

As a general rule, it is hard to determine a priori how well any complicated algorithm or strategy is going to perform. Generally speaking, there are too many factors to take into account for a theoretical ... or intuitive ... approach to give a reliable prediction. Benchmarking and tuning is the way to go.

¹ Obviously, if you know that some technique or algorithm will perform badly, and you have a better alternative that is about the same effort to implement ... do the sensible thing.
