Reputation: 10555
I know there are many factors that could affect a Java program's running time, and I tried to eliminate some of them; for example, System.gc() is called between any two runs. But I found the data still varies a lot. Here is a sample when 10 threads are used:
157th run
0 run time: 9106171
1 run time: 9084652
2 run time: 8990820
3 run time: 8989474
5 run time: 9062850
4 run time: 9302010
9 run time: 9454475
8 run time: 9506585
7 run time: 9494990
6 run time: 9491779
total time: 31 ms
158th run
2 run time: 14754858
5 run time: 14865035
0 run time: 15759180
1 run time: 15988056
3 run time: 16660592
8 run time: 16340240
9 run time: 16544479
6 run time: 17280122
7 run time: 17249778
4 run time: 18026322
total time: 19 ms
I found that most runs take around 17~20 ms, but fewer than 5% of runs take around 25~31 ms. More interestingly, in the latter cases each thread's individual running time is even shorter!
The main thread in this program only calls start() and join() on the threads, and has no other work to do.
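Roughly, the timing harness looks like this (a simplified sketch; doWork() just stands in for the real per-thread computation):

public class TimingTest {
    static final int NUM_THREADS = 10;

    public static void main(String[] args) throws InterruptedException {
        for (int run = 1; run <= 200; run++) {
            System.gc();                             // called between any two runs
            System.out.println(run + "th run");
            Thread[] threads = new Thread[NUM_THREADS];
            long totalStart = System.nanoTime();
            for (int i = 0; i < NUM_THREADS; i++) {
                final int id = i;
                threads[i] = new Thread(() -> {
                    long start = System.nanoTime();
                    doWork();                        // placeholder for the real per-thread task
                    System.out.println(id + " run time: " + (System.nanoTime() - start));
                });
                threads[i].start();
            }
            for (Thread t : threads)
                t.join();                            // main thread only starts and joins
            System.out.println("total time: "
                    + (System.nanoTime() - totalStart) / 1_000_000 + " ms");
        }
    }

    static void doWork() {
        double x = 0;                                // stand-in workload
        for (int i = 0; i < 1_000_000; i++) x += Math.sqrt(i);
    }
}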
Could anyone provide some thoughts/hints?
Upvotes: 1
Views: 162
Reputation: 533680
I note you are starting and stopping threads. As the test is very short, those threads only have time to be assigned to one (or a small number) of the possible arrangements of CPUs. One arrangement could mean more threads are running at once because, with hyper-threading, both logical CPUs of a core are used. When this happens each thread is slowed down (as it has to share a core, cache, etc.), but throughput is increased thanks to fewer thread context switches and more efficient use of the core. In another arrangement each core might have only one logical CPU busy, meaning each thread runs faster, but the total run time is longer.
I would try using an ExecutorService with the same number of tasks as you have CPUs (or some multiple of that), and re-use the threads. This will give you more consistency between runs, and the OS will have time to place your threads in a more efficient manner.
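Something along these lines (a minimal sketch; the pool size, task count and doWork() body are only placeholders):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PooledTest {
    public static void main(String[] args) throws Exception {
        int cpus = Runtime.getRuntime().availableProcessors();
        // Create the pool once and reuse it for every run, so the OS has time
        // to settle the threads onto CPUs instead of placing fresh ones each run.
        ExecutorService pool = Executors.newFixedThreadPool(cpus);

        for (int run = 1; run <= 200; run++) {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < cpus; i++) {
                tasks.add(() -> {
                    long start = System.nanoTime();
                    doWork();                        // the real per-task workload goes here
                    return System.nanoTime() - start;
                });
            }
            long totalStart = System.nanoTime();
            List<Future<Long>> results = pool.invokeAll(tasks);  // blocks until all finish
            long totalMs = (System.nanoTime() - totalStart) / 1_000_000;
            for (Future<Long> f : results)
                System.out.println("task time: " + f.get());
            System.out.println("total time: " + totalMs + " ms");
        }
        pool.shutdown();
    }

    static void doWork() {
        double x = 0;                                // stand-in workload
        for (int i = 0; i < 1_000_000; i++) x += Math.sqrt(i);
    }
}

invokeAll() blocks until every task has finished, so the total time covers the same window you currently measure with join(), while the worker threads themselves live across runs.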
I have written a library which allows you to assign threads to different CPU layouts, e.g. sharing or not sharing cores: https://github.com/peter-lawrey/Java-Thread-Affinity
Upvotes: 3
Reputation: 554
Let's say for the 157th run all threads ran on one core. The bottleneck would simply be them waiting for each other in a queue-like fashion. Each individual thread was quick because it had sole use of the memory resources while it ran, but in aggregate the run was slow, because they all lined up and waited for their turn.
Let's say the 158th run was spread out across multiple cores. All the threads would compete at once for memory and cache resources, so each thread took longer individually. But because they were all working at the same time, the run as a whole finished more quickly.
To test this hypothesis, reuse a pool of threads between runs, and manually set their affinity one way or the other.
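For instance, with the Java-Thread-Affinity library linked in the other answer, each pooled worker could pin itself before doing its work. This is only a rough sketch; the class and package names are taken from the current OpenHFT version of that library, and doWork() again stands in for the real workload:

import net.openhft.affinity.AffinityLock;

public class PinnedWorker implements Runnable {
    @Override
    public void run() {
        // acquireLock() reserves one logical CPU for this thread;
        // acquireCore() would reserve a whole core (no hyper-thread sibling),
        // which is the other arrangement worth comparing.
        AffinityLock lock = AffinityLock.acquireLock();
        try {
            long start = System.nanoTime();
            doWork();                                // the real workload goes here
            System.out.println(Thread.currentThread().getName()
                    + " run time: " + (System.nanoTime() - start));
        } finally {
            lock.release();                          // give the CPU back
        }
    }

    private void doWork() {
        double x = 0;                                // stand-in workload
        for (int i = 0; i < 1_000_000; i++) x += Math.sqrt(i);
    }
}

Submitting such workers to a reused pool keeps thread placement stable between runs, so the two layouts can be compared directly.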
That is just one possible explanation; there are likely other factors, and it is all very non-deterministic. For example, your operating system might have scheduled some other program at the same time (e.g. a cron job).
Upvotes: 1