Reputation: 11
Recall Amdahl’s law on estimating the best possible speedup. Answer the following questions.
You have a program that has 40% of its code parallelized on three processors, and just for this fraction of code, a speedup of 2.3 is achieved. What is the overall speedup?
I'm having trouble understanding the difference between speedup and overall speedup in this question. I know there must be a difference by the way this question is worded.
Upvotes: 1
Views: 788
Reputation: 1
Q : What is the overall speedup?
A good place to start is not the original, trivial Amdahl's law formula but a more contemporary view that extends it, one that discusses add-on overhead costs and also explains the atomicity of split work.
Your original problem formulation seems to bypass the real-world process-orchestration overheads explained there by simply postulating a (net, local) speedup. The implementation add-on overhead costs of the <PAR>-able section under review become "hidden", expressed only as an inefficiency: three times more code-execution resources yield just a 2.3x speedup, not 3.0x. The section therefore takes more than the theoretical 1/3 of its serial time, because that time actually covers the initial setup (an add-on overhead time, not present in a pure-[SERIAL] code execution), plus the parallel processing itself (doing The_useful_work, now on triple the code-execution capacity), plus termination and collection of the results back into the "main" code (further add-on overhead times, also not present in a pure-[SERIAL] code execution).
"Hiding" these natural costs of going into and out of [PARALLEL] code-execution sections simplifies the homework, yet a proper understanding of the real-life costs is crucial, so as not to spend far more (on setup and all the other add-on overhead costs that are unavoidable in the real world) than one would ever get back (from the hoped-for speedup of splitting the work across many processors).
Pure [SERIAL] run: CPU_x runs both the <SEQ> and the <PAR>-able sections of code, one after another.
|-------> time
|START:                                           |DONE: 100% of the code
|                                                 |
|____________<SEQ>________60%_|_40%__<PAR>-able___|
o------------<SEQ>------------o----<PAR>-able-----o  CPU_x

Run with the <PAR>-able section split across three CPUs:
|-------> time
|START:                                |DONE: 100% of the code
o------------<SEQ>------------o        |
|                             o--------o  CPU_1 runs <PAR>'d code
|                             o--------o  CPU_2 runs <PAR>'d code
|                             o--------o  CPU_3 runs <PAR>'d code
|                             |        |
|____________<SEQ>________60%_|________|  ~ 40% / 2.3 ~ 17.39%

The <PAR>-section takes more than 1/3 of its original time: it runs just ~2.3x faster (not 3x), which perhaps reflects the real costs (penalisations) of the new, add-on, process-organisation related setup + termination overheads. In other words, the <PAR>-section has gained a local ("net"-section) speedup of 2.3x instead of the 3.0x achievable on 3 CPU code-execution streams.
Net overall speedup (if no other process-organisation related add-on overhead costs were accrued) is:
( 60% + ( 40% / 1.0 ) )
----------------------- ~ 1.2921 x
( 60% + ( 40% / 2.3 ) )
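To make the difference between the two "speedups" concrete, here is a minimal Python sketch of the same arithmetic. The 2.3x is the local speedup, which applies only to the 40% <PAR>-able fraction; the overall speedup compares the whole run before and after. The function name `overall_speedup` and its parameter names are mine, chosen for illustration, and the sketch assumes (as the homework does) that no further overheads exist beyond what the 2.3x figure already absorbs.

```python
def overall_speedup(parallel_fraction: float, local_speedup: float) -> float:
    """Overall speedup when only `parallel_fraction` of the run time
    is accelerated by `local_speedup` (Amdahl-style estimate).

    Assumption: no overheads other than those already reflected
    in `local_speedup` itself.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be within [0, 1]")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    serial_fraction = 1.0 - parallel_fraction
    # Original run time is normalised to 1.0; the new run time is the
    # untouched serial part plus the shortened parallel part.
    new_time = serial_fraction + parallel_fraction / local_speedup
    return 1.0 / new_time


if __name__ == "__main__":
    # The question's numbers: 40% of the code, local speedup 2.3x.
    print(f"{overall_speedup(0.40, 2.3):.4f}")  # 1.2921
    # For comparison: an ideal, overhead-free 3.0x on three processors.
    print(f"{overall_speedup(0.40, 3.0):.4f}")  # 1.3636
```

Even a perfect 3.0x on the <PAR>-able 40% would give only about 1.36x overall, because the 60% <SEQ> part is not accelerated at all.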
Upvotes: 1