user9810241


CPI in CPUs with pipelining

Suppose that in a CPU the CPI for the add instruction is 0.5 (it completes two add instructions per cycle via pipelining). To calculate the CPU time for 10 add instructions, we multiply 10 * 0.5 * 2 (the clock cycle time is 2 nanoseconds) to get 10 ns, and everything is all right.

But when there is only one add instruction, the formula gives 1 * 0.5 * 2 = 1 ns, which is not correct, because performing the add instruction takes at least one clock cycle (2 ns).

Upvotes: 1

Views: 610

Answers (1)

Alain Merigot

Reputation: 11537

"Consider that in a CPU, the CPI for add instruction is 0.5 (it performs two add instructions in one cycle via pipelining)"

"but when there is only a one add instruction according to the formula we multiply 1 * 0.5 * 2 which is not correct."

You are confusing latency with throughput.

Throughput describes the number of operations that can be performed in a given time. CPI as you use it is a throughput measure (strictly, its inverse). So if you say that CPI = 0.5, you mean that the CPU can deliver 2 add results per cycle.

Latency is the time between the start of an instruction (or an operation, a memory read, etc.) and its end. It is independent of throughput and is related to the number of stages in your processor's pipeline.

So if you look at an individual add instruction, its duration is obviously not half a cycle. Pipelines in recent Pentium processors are between 14 and 19 stages long (older versions had pipelines up to twice as long), so the duration of an individual add is about 15 cycles at best.
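To make the difference concrete, here is a minimal Python sketch of an idealized pipeline. It is only an illustration under loud assumptions: all instructions are independent, there are no stalls, the pipeline is 15 stages deep and completes 2 instructions per cycle, and the clock cycle is 2 ns as in the question. The names `pipelined_cycles`, `DEPTH`, `WIDTH` and `CYCLE_NS` are made up for this example.

```python
import math

CYCLE_NS = 2   # clock cycle time from the question
DEPTH = 15     # assumed pipeline depth in stages (illustrative)
WIDTH = 2      # assumed instructions completed per cycle (CPI = 0.5)


def pipelined_cycles(n_instructions, depth=DEPTH, width=WIDTH):
    """Cycles to finish n independent instructions on an idealized pipeline."""
    if n_instructions <= 0:
        return 0
    # The first group of `width` instructions needs `depth` cycles to
    # traverse the pipeline; each later group completes one cycle after
    # the previous one.
    groups = math.ceil(n_instructions / width)
    return depth + groups - 1


def naive_time_ns(n_instructions):
    """The question's formula: count * CPI * cycle time (throughput only)."""
    return n_instructions * (1 / WIDTH) * CYCLE_NS


def pipelined_time_ns(n_instructions):
    """Latency of the first group plus throughput for the remaining ones."""
    return pipelined_cycles(n_instructions) * CYCLE_NS


for n in (1, 10, 1000):
    print(n, naive_time_ns(n), pipelined_time_ns(n))
```

Under these assumptions a single add costs the full pipeline latency, and the throughput-only formula becomes a good approximation only when the instruction count is large compared with the pipeline depth.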

"So when we want to calculate the CPU time for 10 add instructions we multiply 10 * 0.5 * 2 (clock cycle time is 2 Nanoseconds) and everything is all right."

No. Other factors can have an impact, such as dependencies and, more generally, interactions with other instructions. CPI was introduced as a measure of the average number of cycles per instruction over a complete program. Present-day computers are so complex that it is almost impossible to estimate CPI accurately without executing the program. So, to get it, we time the program and divide the number of elapsed cycles by the number of executed instructions. This takes into account memory accesses, dependencies, branch mispredictions, etc., and it is what matters to the end user, who wants to know how fast their program will run. Your estimate is a theoretical one that never happens in real life.
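The measured CPI described above can be sketched in a few lines of Python. This is only an illustration: the function name `effective_cpi` and the numbers in the usage example are hypothetical, not real measurements, and in practice the elapsed time and instruction count would come from a timer and a hardware performance counter.

```python
def effective_cpi(elapsed_ns, cycle_ns, instruction_count):
    """Average cycles per instruction derived from a timed run."""
    if instruction_count <= 0:
        raise ValueError("instruction_count must be positive")
    # Convert wall-clock time into elapsed clock cycles.
    cycles = elapsed_ns / cycle_ns
    return cycles / instruction_count


# Hypothetical run: 1,000,000 instructions took 3 ms on a 2 ns clock.
measured = effective_cpi(elapsed_ns=3_000_000, cycle_ns=2,
                         instruction_count=1_000_000)
print(measured)  # 1.5
```

In this made-up example the measured CPI is three times the theoretical 0.5, which is the kind of gap that stalls, cache misses and branch mispredictions produce.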

Upvotes: 1
