Reputation: 11
I have C code that processes a large amount of data (80 MB of U16 values) in a global array. To reduce the processing time, I used the pthreads library. The processing multiplies each element by a constant. With 2 threads it takes 50 ms; with 3 threads it takes approximately 120 ms. I also tried increasing the stack size, but that did not help.
There is no rand() call or dynamic memory allocation in the code. It just calls a simple function in 2 or 3 threads.
What factor limits the performance when the number of threads is increased? Also, please suggest how I can reduce the execution time further.
P.S.: My system has 8 GB of RAM and an Intel i3 processor, and runs Windows (if that helps).
Upvotes: 1
Views: 1470
Reputation: 2080
Let's assume you made an optimal implementation (which can be really hard, depending on the problem).
You separated the data into non-overlapping blocks and fed them to the threads.
So far so good.
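For reference, splitting the array into non-overlapping blocks could look roughly like the sketch below. This is not the asker's code: `slice_t`, `scale_slice`, `scale_parallel` and `MAX_THREADS` are names I made up, and I am assuming the multiplication is allowed to wrap around at 16 bits.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 16  /* hypothetical upper bound for this sketch */

/* Work description for one thread: the half-open index range [begin, end). */
typedef struct {
    uint16_t *data;
    size_t begin;
    size_t end;
    uint16_t factor;
} slice_t;

/* Thread entry point: scales only the elements of its own slice. */
static void *scale_slice(void *arg)
{
    slice_t *s = arg;
    for (size_t i = s->begin; i < s->end; ++i) {
        /* Multiply in 32 bits to avoid signed overflow, then wrap to 16 bits. */
        s->data[i] = (uint16_t)((uint32_t)s->data[i] * s->factor);
    }
    return NULL;
}

/* Splits data[0..n) into nthreads contiguous, non-overlapping slices.
 * Returns 0 on success, or the error code from pthread_create. */
int scale_parallel(uint16_t *data, size_t n, uint16_t factor, int nthreads)
{
    assert(nthreads >= 1 && nthreads <= MAX_THREADS);

    pthread_t tid[MAX_THREADS];
    slice_t slice[MAX_THREADS];
    size_t chunk = n / (size_t)nthreads;
    size_t rest = n % (size_t)nthreads;
    size_t pos = 0;
    int started = 0;
    int err = 0;

    for (int t = 0; t < nthreads; ++t) {
        /* The first `rest` slices get one extra element each. */
        size_t len = chunk + ((size_t)t < rest ? 1 : 0);
        slice[t] = (slice_t){ data, pos, pos + len, factor };
        pos += len;
        err = pthread_create(&tid[t], NULL, scale_slice, &slice[t]);
        if (err != 0)
            break;
        ++started;
    }
    /* Wait for every thread that was actually started. */
    for (int t = 0; t < started; ++t)
        pthread_join(tid[t], NULL);
    return err;
}
```

Calling `scale_parallel(array, count, 3, 2)` would multiply every element by 3 using two threads. Because each thread only touches its own index range, no locking is needed.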
First of all, creating and terminating (and possibly managing) a thread takes time, which is added to the computation time already needed. This overhead can cancel out the benefit.
More important, I think, is that you have an i3 processor. Many of them have only 2 cores, and depending on whether hyperthreading is enabled, you may also have only 2 logical cores. On such a system you cannot benefit from more than 2 threads (if each thread can use all the resources of its core); a third thread may just get in the way of the other two and lengthen the runtime.
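If you want to check how many logical cores the OS reports before choosing a thread count, something like this sketch might help. `logical_cpu_count` and `pick_thread_count` are hypothetical helper names. The Windows branch uses `GetSystemInfo`; the other branch assumes a system where `sysconf(_SC_NPROCESSORS_ONLN)` is available (it is a common extension on Linux and macOS, not something every platform guarantees).

```c
#include <assert.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#endif

/* Number of logical processors reported by the OS (at least 1). */
int logical_cpu_count(void)
{
#ifdef _WIN32
    SYSTEM_INFO info;
    GetSystemInfo(&info);
    return info.dwNumberOfProcessors > 0 ? (int)info.dwNumberOfProcessors : 1;
#else
    long n = sysconf(_SC_NPROCESSORS_ONLN);  /* assumed to be available */
    return n > 0 ? (int)n : 1;
#endif
}

/* Caps the requested number of worker threads at the logical core count. */
int pick_thread_count(int requested)
{
    assert(requested >= 1);
    int cpus = logical_cpu_count();
    return requested < cpus ? requested : cpus;
}
```

Note that this counts logical cores, so with hyperthreading enabled the number is higher than the number of physical cores, and two threads sharing one physical core will not necessarily double the throughput.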
Upvotes: 1
Reputation: 215134
Most often the reason is incorrect benchmarking...
That aside, you have to realize that creating/deleting a thread is a resource-intensive action. It takes time, it needs memory.
This means that more threads do not necessarily make the whole program run faster; rather, they can make a specific task run faster. Therefore, the use of threads in an application has to be considered on a case-by-case basis.
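To get numbers you can trust, it may help to measure the thread start-up cost separately from the work itself and to repeat each measurement several times. Below is a minimal sketch, assuming a toolchain that provides POSIX `clock_gettime` with `CLOCK_MONOTONIC`. All function names are hypothetical, and `example_work` is only a stand-in for the real workload.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Monotonic wall-clock time in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Thread body that does no work, so only create/join cost is measured. */
static void *do_nothing(void *arg)
{
    return arg;
}

/* Average seconds for one pthread_create + pthread_join pair,
 * or -1.0 if a thread could not be created. */
double thread_overhead_seconds(int repeats)
{
    assert(repeats > 0);
    double start = now_seconds();
    for (int i = 0; i < repeats; ++i) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, do_nothing, NULL) != 0)
            return -1.0;
        pthread_join(tid, NULL);
    }
    return (now_seconds() - start) / repeats;
}

/* Smallest wall-clock time over `repeats` runs of work(arg).
 * Taking the minimum reduces noise from other processes. */
double best_time_seconds(void (*work)(void *), void *arg, int repeats)
{
    assert(repeats > 0);
    double best = -1.0;
    for (int i = 0; i < repeats; ++i) {
        double start = now_seconds();
        work(arg);
        double elapsed = now_seconds() - start;
        if (best < 0.0 || elapsed < best)
            best = elapsed;
    }
    return best;
}

/* Hypothetical workload, used only to demonstrate the timer. */
void example_work(void *arg)
{
    volatile unsigned long *sink = arg;
    for (unsigned long i = 0; i < 1000; ++i)
        *sink += i;
}
```

Comparing `thread_overhead_seconds(100)` with `best_time_seconds(...)` for the real workload gives a rough idea of whether the create/join cost is significant next to the computation.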
Upvotes: 0