smali

Reputation: 4805

Parallel Thread Execution to achieve performance

I am a little bit confused about multithreading. Normally we create multiple threads to break the main process into subtasks, to achieve responsiveness and to avoid waiting time.

But here I have a situation where I have to execute the same task using multiple threads in parallel.

My processor can execute 4 threads in parallel, so will it improve performance if I create more than 4 threads (10 or more)? When I put this question to my colleague, he said nothing bad will happen: we already run many threads in other applications (browser threads, kernel threads, etc.), so he suggested creating multiple threads for the same task.

But if I create more than 4 threads that are supposed to execute in parallel, won't that cause more context switches and decrease performance?

Or, even though we create multiple threads to execute in parallel, will they just execute one after the other, so the performance stays the same?

So what should I do in the above situations, and are my assumptions correct?
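To make the trade-off concrete, here is a minimal Python sketch of sizing a thread pool against the hardware instead of picking an arbitrary thread count. The worker function and the cap of 10 are made up for the example, and note that CPython's GIL serializes CPU-bound Python code, so this illustrates the sizing idea rather than real CPU parallelism (for which you would use processes):

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Ask how many hardware threads the machine exposes.
cores = os.cpu_count()

# For CPU-bound work, a pool larger than the core count adds
# context-switch overhead without adding parallelism, so cap it.
pool_size = min(10, cores)

def work(n):
    # Hypothetical CPU-bound subtask: sum of squares up to n.
    return sum(i * i for i in range(n))

with ThreadPoolExecutor(max_workers=pool_size) as pool:
    results = list(pool.map(work, [1000] * 8))
```

The key point is that `pool_size` is derived from the hardware: eight tasks are queued, but at most `min(10, cores)` of them are in flight at once.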

Edit:

  1. 1 thread: processing took 120 seconds.
  2. 2 threads: processing took about 60 seconds.
  3. 3 threads: processing still took about 60 seconds (no change from 2 threads).

Is it because my hardware can only run 2 threads concurrently (being dual-core)?

Software thread = a piece of code to execute.
Hardware thread = a core (processor) that runs a software thread.

So my CPU supports only 2 concurrent threads; if I purchase an AMD CPU with 8 or 12 cores, can I achieve higher performance?
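The measurements above match a simple model: if the work is perfectly parallelizable, the processing time is divided by the number of threads that can actually run at once, which is capped by the core count. A sketch of that model (the function name and the assumption of perfectly parallel work are mine):

```python
def expected_time(total_seconds, threads, cores):
    # With perfectly parallelizable work, only min(threads, cores)
    # threads can make progress at the same time; extra software
    # threads just take turns on the same cores.
    return total_seconds / min(threads, cores)

print(expected_time(120, 1, 2))  # 120.0
print(expected_time(120, 2, 2))  # 60.0
print(expected_time(120, 3, 2))  # 60.0 (no improvement past 2 threads)
```

This reproduces the 120 s / 60 s / 60 s pattern on a dual-core machine, and predicts that an 8- or 12-core CPU would help only up to the point where thread count, core count, and the parallelizable fraction of the work all allow it.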

Upvotes: 0

Views: 311

Answers (2)

rparolin

Reputation: 508

http://en.wikipedia.org/wiki/Amdahl%27s_law

Amdahl's law states, in a nutshell, that the performance boost you receive from parallel execution is limited by the portion of your code that must run sequentially.
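The law itself is a one-line formula: with a parallelizable fraction `p` of the work and `n` processors, the overall speedup is `1 / ((1 - p) + p / n)`. A small sketch:

```python
def amdahl_speedup(p, n):
    # p: fraction of the work that can run in parallel (0.0 to 1.0)
    # n: number of processors
    # The sequential fraction (1 - p) is never sped up, which caps
    # the total speedup at 1 / (1 - p) no matter how large n gets.
    return 1.0 / ((1.0 - p) + p / n)

print(amdahl_speedup(1.0, 4))   # 4.0  (fully parallel work scales with cores)
print(amdahl_speedup(0.5, 4))   # 1.6  (half sequential: far below 4x)
```

Note how with `p = 0.5`, even infinitely many cores can never exceed a 2x speedup, which is exactly why eliminating sequential sections matters more than adding threads.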

Without knowing your problem space here are some general things you should look at:

  • Refactor to eliminate mutexes/locks. By definition they force code to run sequentially.
  • Reduce context-switch overhead by pinning threads to physical cores. This becomes more complicated when threads must wait for work (i.e. blocking on I/O), but in general you want to keep your cores as busy as possible running your program, not switching out threads.
  • Unless you absolutely need raw threads and sync primitives, try using a task scheduler or parallel algorithms library to parallelize your work. Examples would be Intel TBB, Thrust, or Apple's libDispatch.
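The libraries named above are C/C++ task schedulers; as an analogous sketch in Python, `concurrent.futures` lets you submit independent tasks and leave thread management to the pool (the `checksum` task and chunk sizes here are invented for the example):

```python
from concurrent.futures import ThreadPoolExecutor

def checksum(chunk):
    # Hypothetical independent task: no shared state, so no locks needed.
    return sum(chunk) % 255

data = list(range(1000))
chunks = [data[i:i + 250] for i in range(0, len(data), 250)]

# Submit tasks and let the scheduler map them onto worker threads,
# instead of creating and joining threads by hand.
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(checksum, c) for c in chunks]
    results = [f.result() for f in futures]
```

Structuring the work as independent tasks, rather than hand-managed threads, is what makes it easy for the scheduler to keep every core busy.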

Upvotes: 0

Daniel

Reputation: 1051

Multitasking is pretty complex, and performance gains usually depend a lot on the problem itself:

  1. Only a part of the application can be done in parallel (there is always a first part that splits the work into multiple tasks). So the first question is: how much of the work can be done in parallel, and how much of it needs to be synchronized? (In some cases you can stop here, because so little can be done in parallel that the whole effort isn't worth it.)
  2. Multiple tasks may depend on each other (one task may need the result of another task). These tasks cannot be executed in parallel.
  3. Multiple tasks may work on the same data/resources (read/write situation). Here we need to synchronize access to this data/these resources. If all tasks need write access to the same object during the WHOLE process, then we cannot work in parallel.

Basically this means that without an exact definition of the problem (dependencies between tasks, dependencies on data, number of parallel tasks, ...), it's very hard to tell how much performance you'll gain by using multiple threads (and whether it's really worth it).

Upvotes: 1
