Reputation: 1359
We are using Java 7 and working on a multithreaded data-crunching application. Due to certain constraints, we are not using Spark or any other map-reduce approach to solve this problem. The goal of the project is to maximize the application's performance using multithreading.
My understanding is that at any given point, assuming the CPU is running nothing apart from the OS, the number of threads working simultaneously will be equal to the number of hardware threads (hyper-threads) that the CPU provides. But there is the Java GC, which will kick in every now and then. We have to consider that as well.
Also, I am aware that if I create more threads than that, I will actually degrade performance because of the time spent in context switching.
The question is what would be the best way to consider all these things and create an appropriate number of threads. Any ideas or thought process? Is there any other process that I should consider?
Upvotes: 1
Views: 939
Reputation: 533530
The question is what would be the best way to consider all these things and create appropriate number of threads
I would use Java 8, which does this for you, e.g.
Results result = listOfWork.parallelStream()
        .map(t -> t.doWork())
        .collect(Collectors.reducing(.....));
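To make the snippet above concrete, here is a minimal runnable sketch. The class name `ParallelStreamSketch`, the `doWork` stand-in (it just squares its input) and the sum reduction are assumptions for illustration, not the asker's actual workload. Parallel streams run on the common ForkJoinPool, whose default parallelism is derived from the available processor count, so no thread count is chosen by hand.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ParallelStreamSketch {
    // Hypothetical stand-in for t.doWork(): squares its input.
    static long doWork(int n) {
        return (long) n * n;
    }

    // Applies doWork to every element in parallel, then sums the results.
    static long crunch(List<Integer> listOfWork) {
        return listOfWork.parallelStream()
                .map(ParallelStreamSketch::doWork)
                .collect(Collectors.reducing(0L, Long::sum));
    }

    public static void main(String[] args) {
        List<Integer> work = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(crunch(work)); // 1 + 4 + 9 + 16 + 25 = 55
    }
}
```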
However if you are stuck on Java 7, you can use an ExecutorService.
int procs = Runtime.getRuntime().availableProcessors();
ExecutorService es = Executors.newFixedThreadPool(procs);
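As a rough sketch of how that pool might be used on Java 7 (no lambdas), the work can be split into chunks, submitted as `Callable` tasks, and combined from the returned `Future`s. The class name `FixedPoolSketch`, the `sumRange` task and the chunking scheme are assumptions for illustration; substitute your own unit of work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FixedPoolSketch {
    // Hypothetical unit of work: sums the integers in [from, to).
    static long sumRange(int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += i;
        }
        return sum;
    }

    // Splits [0, n) into `chunks` pieces (chunks must be >= 1), runs them
    // on a pool sized to the processor count, and adds up the partial sums.
    static long crunch(int n, int chunks) {
        int procs = Runtime.getRuntime().availableProcessors();
        ExecutorService es = Executors.newFixedThreadPool(procs);
        try {
            List<Future<Long>> futures = new ArrayList<Future<Long>>();
            int chunkSize = (n + chunks - 1) / chunks; // ceiling division
            for (int c = 0; c < chunks; c++) {
                final int from = Math.min(n, c * chunkSize);
                final int to = Math.min(n, from + chunkSize);
                futures.add(es.submit(new Callable<Long>() {
                    @Override
                    public Long call() {
                        return sumRange(from, to);
                    }
                }));
            }
            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get(); // blocks until that chunk is done
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            es.shutdown(); // let the worker threads exit
        }
    }

    public static void main(String[] args) {
        System.out.println(crunch(1000, 8)); // sum of 0..999 = 499500
    }
}
```

Submitting more chunks than threads is fine here: the extra tasks wait in the pool's queue, so the number of running threads stays at `procs`.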
But there is java GC which will kick-in every now and then
Unless you are using a concurrent collector such as CMS, the GC doesn't run at the same time as your application threads (they are paused while it collects), so it doesn't matter what these threads are doing (in terms of tuning your thread pool).
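If you are unsure which collector your JVM is running or how much time it takes, one way to check is the standard `java.lang.management` API. This is only a sketch; the class name `GcReportSketch` is made up for illustration, and the collector names printed depend on the JVM and its flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcReportSketch {
    // Builds one line per collector: its name, how many collections it has
    // run so far, and the approximate accumulated collection time in ms.
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```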
Is there any other process that I should consider?
If you have other processes on the machine which use the CPU a lot, you should consider them.
Upvotes: 3
Reputation: 1112
I actually did research on this last semester. For CPU-bound work, a good rule of thumb for increased performance is to use as many threads as cores, except on a hyper-threaded system, where one should use twice as many threads as cores. The other rule of thumb is for I/O-bound work: quadruple the number of threads per core, except on a hyper-threaded system, where one can quadruple the number of threads per core.
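These rules of thumb can be written as a small helper, sketched here under some assumptions: the class name `PoolSizeSketch` and the `threadsPerProcessor` parameter are hypothetical, and the multipliers are only starting points to benchmark against. Note that `Runtime.availableProcessors()` reports the processors available to the JVM, which normally already counts hyper-threads as separate processors, so the "twice as many as cores" case is usually covered by a multiplier of 1.

```java
class PoolSizeSketch {
    // Returns processors * threadsPerProcessor, rejecting non-positive inputs.
    static int poolSize(int logicalProcessors, int threadsPerProcessor) {
        if (logicalProcessors < 1 || threadsPerProcessor < 1) {
            throw new IllegalArgumentException("both arguments must be >= 1");
        }
        return logicalProcessors * threadsPerProcessor;
    }

    public static void main(String[] args) {
        int procs = Runtime.getRuntime().availableProcessors();
        // CPU-bound: one thread per logical processor.
        System.out.println("CPU-bound pool size: " + poolSize(procs, 1));
        // I/O-bound: the multiplier of 4 comes from the rule of thumb above.
        System.out.println("I/O-bound pool size: " + poolSize(procs, 4));
    }
}
```

Whatever value this gives, it is worth measuring throughput with a few neighbouring sizes on the real workload before settling on one.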
Upvotes: 0