facebook-1389780026

Reputation: 71

Program performance in Java fluctuates with thread variation

The title, I admit, is a bit misleading, but I am confused about why this happens.

I've written a program in Java that takes an argument x and instantiates x threads to do the work of the program. The machine I'm running it on has 8 cores and can handle 32 threads in parallel (each core has 4 hyperthreads). When I run the program with more than 8 threads (e.g. 22), I notice that it runs faster with an even number of threads than with an odd number: with 23 threads it is actually slower than with 22. The performance difference between the two is about 10%. Why would this be? Thread overhead doesn't really account for this, and I feel that as long as I'm running fewer than 32 threads, the program should only get faster as I increase the number of threads.

To give you an idea of what the program is doing: it takes a 1000 * 1000 array, and each thread is assigned a portion of that array to update (any leftover elements from an uneven split are given to the last thread instantiated).
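A minimal sketch of the split described above, assuming the array is divided by element count into contiguous ranges. The class and method names (`PartitionSketch`, `ranges`) are made up for illustration and are not the actual program:

```java
public class PartitionSketch {
    // Returns {start, end} (end exclusive) for each thread. Every thread gets
    // total / threads elements; the last thread also takes the leftovers.
    // Assumption: the split is over flat element indices, not whole rows.
    static int[][] ranges(int total, int threads) {
        int chunk = total / threads;
        int[][] out = new int[threads][2];
        for (int t = 0; t < threads; t++) {
            out[t][0] = t * chunk;
            out[t][1] = (t == threads - 1) ? total : (t + 1) * chunk;
        }
        return out;
    }

    public static void main(String[] args) {
        int total = 1000 * 1000;
        for (int threads : new int[] {22, 23}) {
            int[][] r = ranges(total, threads);
            int last = threads - 1;
            System.out.println(threads + " threads: chunk = " + (r[0][1] - r[0][0])
                    + ", last thread gets " + (r[last][1] - r[last][0]));
        }
    }
}
```

Printing the chunk sizes for 22 and 23 threads shows how much extra work the last thread carries in each case.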

Is there any good reason for the odd/even thread performance difference?

Upvotes: 1

Views: 177

Answers (1)

Has QUIT--Anony-Mousse

Reputation: 77505

Two reasons I can imagine:

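  1. The need to synchronize the memory access of your cores/threads. Writes from one core invalidate the matching cache lines in the other cores, which brings down performance. Try giving the threads really disjoint tasks; don't let them work on the same array. Keep in mind that memory isn't managed in individual bytes but in cache lines (commonly 64 bytes), so two threads writing to neighbouring elements can still contend for the same line.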

  2. Hyperthreading CPUs often don't deliver full performance per hardware thread. The threads on one core may, for example, have to share some floating point units. This doesn't matter when e.g. one thread is integer-math heavy and the other is float-heavy. But four threads that each need the floating point units will probably end up waiting, switching contexts, signalling the other thread, switching context back, waiting again...

These are just two guesses. To get a better answer, you should have given the actual CPU you are using, the partitioning scheme you are using, and a more detailed hint about the computational task.
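To test the first guess, one option is to hand each thread a contiguous block of whole rows, so that threads write to separate row arrays, and then time different thread counts. The following is only a rough sketch under assumptions: the data is taken to be a `double[][]`, the update in `updateRows` is a placeholder, and the names `RowBlockSketch`, `updateRows` and `run` are invented here. It also spreads the leftover rows over the first threads instead of giving them all to the last one.

```java
public class RowBlockSketch {
    static final int N = 1000;

    // Updates rows [from, to) of the grid. The arithmetic is a placeholder
    // for whatever the real program computes.
    static void updateRows(double[][] grid, int from, int to) {
        for (int i = from; i < to; i++) {
            double[] row = grid[i];
            for (int j = 0; j < row.length; j++) {
                row[j] = row[j] * 0.5 + 1.0;
            }
        }
    }

    // Runs one pass over the grid using the given number of threads.
    // The first (rows % threads) threads each take one extra row.
    static void run(double[][] grid, int threads) throws InterruptedException {
        int base = grid.length / threads;
        int extra = grid.length % threads;
        Thread[] workers = new Thread[threads];
        int start = 0;
        for (int t = 0; t < threads; t++) {
            final int from = start;
            final int to = from + base + (t < extra ? 1 : 0);
            workers[t] = new Thread(() -> updateRows(grid, from, to));
            workers[t].start();
            start = to;
        }
        for (Thread w : workers) {
            w.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = args.length > 0 ? Integer.parseInt(args[0]) : 8;
        double[][] grid = new double[N][N];
        // Warm-up passes so the JIT compiler has a chance to settle.
        for (int warm = 0; warm < 20; warm++) {
            run(grid, threads);
        }
        int reps = 100;
        long t0 = System.nanoTime();
        for (int rep = 0; rep < reps; rep++) {
            run(grid, threads);
        }
        double msPerPass = (System.nanoTime() - t0) / 1e6 / reps;
        System.out.printf("%d threads: %.2f ms per pass%n", threads, msPerPass);
    }
}
```

Running it with 22 and then 23 as the argument, several times each, would show whether the odd/even gap survives once the work is split by rows. Thread start-up cost is included in each pass, so the absolute numbers only make sense relative to each other.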

Upvotes: 2
