Reputation: 21
I noticed that, for some code, running more than one thread is much, much slower than running a single thread, and I have been pulling my hair out trying to understand why. Can anyone help?
Code explanation: I sometimes have a very large array whose parts I need to process in parallel for performance. Each "part" of a row is looped over and processed by a specific thread. I've noticed that having only one "part", i.e. a single worker thread running through the whole array, is noticeably faster than dividing the array and processing the sub-arrays with separate threads.
bool m_generate_row_worker(ull t_row_start, ull t_row_end)
{
    for (; t_row_start < t_row_end; t_row_start++)
    {
        m_current_row[t_row_start] = m_singularity_checker(m_previous_row[t_row_start],
                                                           m_shared_random_row[t_row_start]);
    }
    return true;
}
...
//code
...
for (unsigned short thread_indx = 0; thread_indx < noThreads - 1; thread_indx++)
{
    m_threads_array[thread_indx] = std::thread(
        m_generate_row_worker, this,
        thread_indx * m_parts_per_thread, (thread_indx + 1) * m_parts_per_thread);
}
m_threads_array[noThreads - 1] = std::thread(
    m_generate_row_worker, this,
    (noThreads - 1) * m_parts_per_thread,
    std::max(noThreads * m_parts_per_thread, m_blocks_per_row));
//join
for (unsigned short thread_indx = 0; thread_indx < noThreads; thread_indx++)
{
    m_threads_array[thread_indx].join();
}
//EDIT
inline ull m_singularity_checker(ull t_to_be_ckecked_with, ull t_to_be_ckecked)
{
    return (t_to_be_ckecked & (t_to_be_ckecked_with << 1) & (t_to_be_ckecked_with >> 1))
         | (t_to_be_ckecked_with & t_to_be_ckecked);
}
Upvotes: 1
Views: 843
Reputation: 238351
why does having more than one thread (parallel processing) in some specific cases degrade performance?
In conclusion: Threads are not a silver bullet that automatically multiplies the performance of your program.
Regarding your program, we cannot rule out any of the above potential issues given the excerpt that you have shown.
Some tips on avoiding or finding the above issues:
Upvotes: 3
Reputation: 16341
Using threads does not always mean that you will get more work done. For example, using 2 threads does not mean you will get a task done in half the time. There is an overhead to setting up threads and, depending on how many cores you have, the OS, etc., a cost to the context switching that occurs between threads (saving one thread's stack/registers and loading the next one's; it all adds up). At some point, adding more threads will start to slow your program down, because more time is spent setting threads up, tearing them down, and switching between them than actually doing work. You may be a victim of this.
If you have 100 very small items of work to do (say, 1 instruction each), then 100 threads are guaranteed to be slower, since you now have ("many instructions" + 1) x 100 units of work to do, where the "many instructions" are the cost of setting up the threads, cleaning them up at the end, and switching between them.
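To make this concrete, here is a minimal standalone sketch (not your code; the array size and the trivial shift "work" are made up for illustration) that times a single loop against spawning one thread per tiny item:

#include <chrono>
#include <cstdint>
#include <iostream>
#include <thread>
#include <vector>

int main()
{
    constexpr std::size_t n = 100;            // 100 tiny work items
    std::vector<std::uint64_t> data(n, 1);

    // Single thread: one pass over all items.
    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i)
        data[i] <<= 1;                        // trivial "work"
    auto t1 = std::chrono::steady_clock::now();

    // One thread per item: setup/teardown dwarfs the trivial work.
    auto t2 = std::chrono::steady_clock::now();
    std::vector<std::thread> threads;
    threads.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        threads.emplace_back([&data, i] { data[i] <<= 1; });  // each thread touches one element
    for (auto& t : threads)
        t.join();
    auto t3 = std::chrono::steady_clock::now();

    // Checksum so the work is not optimized away.
    std::uint64_t sum = 0;
    for (auto v : data) sum += v;

    using us = std::chrono::microseconds;
    std::cout << "single thread:   " << std::chrono::duration_cast<us>(t1 - t0).count() << " us\n"
              << "thread per item: " << std::chrono::duration_cast<us>(t3 - t2).count() << " us\n"
              << "(checksum " << sum << ")\n";
}

On a typical machine the thread-per-item version comes out far slower, purely from the thread setup/teardown.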
So you may want to start profiling this for yourself: how much work is done processing each row, and how many threads in total are you setting up?
One very crude but quick/simple way to start measuring is to take the time elapsed to process one row in isolation (e.g. use the std::chrono functions to record the time at the start of processing one row and again at the end, and look at the total time spent). Then maybe do the same test over the entire table to get an idea of the total time.
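For example, something along these lines (the time_us helper and the empty lambda are just illustration; put the call you want to time inside the lambda, e.g. a single-threaded pass of your m_generate_row_worker over one row):

#include <chrono>
#include <iostream>
#include <utility>

// Illustrative helper: run any callable once and return the elapsed microseconds.
template <typename F>
long long time_us(F&& f)
{
    auto start = std::chrono::steady_clock::now();
    std::forward<F>(f)();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
}

int main()
{
    // Stand-in for "process one row": replace the lambda body with your own call.
    long long one_row = time_us([] { /* process one row here */ });
    std::cout << "one row: " << one_row << " us\n";
}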
If you find that an individual row takes very little time, then you may not be getting much benefit from the threads. You may be better off splitting the table into chunks of work equal to the number of cores your CPU has, then changing the number of threads (+/-) to find the sweet spot. Just creating threads based on the number of rows is a poor choice; you really want to design it to max out each core (for example).
So if you had 4 cores, maybe start by splitting the work into 4 threads. Then test it with 8; if that's better, try 16; if it's worse, try 12... etc.
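A rough sketch of that kind of split (the data vector and the process_range placeholder are made up; std::thread::hardware_concurrency() may return 0, so it is clamped to at least 1):

#include <algorithm>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Placeholder for the per-range worker: each thread handles [begin, end).
void process_range(std::vector<std::uint64_t>& data, std::size_t begin, std::size_t end)
{
    for (std::size_t i = begin; i < end; ++i)
        data[i] ^= (data[i] << 1);            // placeholder work
}

int main()
{
    std::vector<std::uint64_t> data(1000000, 1);

    // One chunk per hardware core (fall back to 1 if the count is unknown).
    unsigned cores = std::max(1u, std::thread::hardware_concurrency());
    std::size_t chunk = (data.size() + cores - 1) / cores;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < cores; ++t)
    {
        std::size_t begin = t * chunk;
        std::size_t end = std::min(begin + chunk, data.size());
        if (begin >= end) break;              // fewer chunks than cores
        workers.emplace_back(process_range, std::ref(data), begin, end);
    }
    for (auto& w : workers)
        w.join();
}

From there you can vary the thread count up and down around the core count and time each variant.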
Also you might get different results on different PCs...
Upvotes: 1