Adagio

Reputation: 121

Calculating Cycles Per Instruction

From what I understand, to calculate CPI you take the percentage of each instruction type multiplied by its cycle count, and sum those, right? Does the machine's clock speed play any part in this calculation whatsoever?

I have a problem that asks me if a change should be recommended.

Machine 1: 40% R - 5 cycles, 30% lw - 6 cycles, 15% sw - 6 cycles, 15% beq - 3 cycles, on a 2.5 GHz machine

Machine 2: 40% R - 5 cycles, 30% lw - 6 cycles, 15% sw - 6 cycles, 15% beq - 4 cycles, on a 2.7 GHz machine

By my calculations, machine 1 has a CPI of 5.15 while machine 2 has a CPI of 5.30. Is it okay to ignore the GHz of each machine and say that the change would not be a good idea, or do I have to factor the clock speed in?
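For what it's worth, the weighted-average CPI can be sketched in a few lines of Python (the mixes and cycle counts are just the numbers from the question):

```python
# Instruction mix: (fraction of instructions, cycles for that type).
machine1 = [(0.40, 5), (0.30, 6), (0.15, 6), (0.15, 3)]  # R, lw, sw, beq
machine2 = [(0.40, 5), (0.30, 6), (0.15, 6), (0.15, 4)]  # beq costs one more cycle

def cpi(mix):
    # Weighted average: sum of (fraction * cycles) over all instruction types.
    return sum(frac * cycles for frac, cycles in mix)

print(cpi(machine1))  # ~5.15
print(cpi(machine2))  # ~5.30
```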

Upvotes: 2

Views: 8319

Answers (2)

old_timer

Reputation: 71576

Cycles per instruction is a count of cycles, so GHz doesn't matter as far as that average goes. But from your numbers we can see that one instruction takes more clocks on the machine that runs at a higher speed.

So while it takes more cycles to do the same job on the faster processor, the speed of that processor DOES compensate. It seems clear this is a question about whether the processor speed makes up for the extra clock.

5.15 cycles/instruction ÷ 2.5 (giga)cycles/second: the cycles cancel out and you get 2.06 seconds per (giga)instruction, i.e. 2.06 (nano)seconds per instruction.

5.30 / 2.7 = 1.96296 (nano) seconds / instruction

The faster one takes slightly less time per instruction, so it will run the program faster.
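Spelling that unit cancellation out as a quick Python check (frequencies in GHz, so the result comes out in ns per instruction):

```python
cpi1, ghz1 = 5.15, 2.5
cpi2, ghz2 = 5.30, 2.7

# (cycles/insn) / (gigacycles/second) = nanoseconds per instruction.
ns_per_insn1 = cpi1 / ghz1   # ~2.06 ns
ns_per_insn2 = cpi2 / ghz2   # ~1.963 ns

# Lower time per instruction means the same program finishes sooner.
print(ns_per_insn1, ns_per_insn2, ns_per_insn2 < ns_per_insn1)
```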

Another way to see this is to check the math.

For 100 instructions on the slower machine, 15 of them are beq at 3 cycles each: 45 of the 515 total cycles. The same 15 beq instructions take 60 cycles on the faster machine, and every other instruction type takes the same count on both, so 530 cycles total for the same 100 instructions.

515 cycles at 2.5 GHz vs 530 cycles at 2.7 GHz.

We want the amount of time.

Hz is cycles/second, and we want seconds on top, so we want

cycles / (cycles/second), so the cycles cancel out and seconds end up on top.

1/2.5 = 0.400 (400 picoseconds per cycle); 1/2.7 ≈ 0.370

0.400 * 515 = 206.0 units of time; 0.370 * 530 ≈ 196.3 units of time

So despite taking 15 more cycles, the processor speed difference is enough to compensate.

2.7/2.5 = 1.08
530/515 ≈ 1.029

So 2.5 * 1.029 ≈ 2.573, meaning any processor at about 2.573 GHz or faster would run this program faster.
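That break-even frequency can be sketched directly from the stated instruction mix, as a Python sanity check (the 2.5 GHz baseline is from the question):

```python
mix1 = [(0.40, 5), (0.30, 6), (0.15, 6), (0.15, 3)]  # machine 1: R, lw, sw, beq
mix2 = [(0.40, 5), (0.30, 6), (0.15, 6), (0.15, 4)]  # machine 2: beq costs one more cycle

insns = 100
cycles1 = insns * sum(f * c for f, c in mix1)  # ~515 cycles per 100 instructions
cycles2 = insns * sum(f * c for f, c in mix2)  # ~530 cycles per 100 instructions

# Break-even: the faster design wins once its frequency ratio exceeds
# its cycle-count ratio relative to the 2.5 GHz baseline.
breakeven_ghz = 2.5 * cycles2 / cycles1
print(round(breakeven_ghz, 3))  # ~2.573
```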

Now, what are the rules for changing computers? Is less time defined as a reason to change? What is the definition of "better"? How much more power does the faster one consume? It might take less time, but power consumption might not scale linearly, so it may draw more watts despite finishing sooner. I assume the question is not that detailed (meaning it is vague and poorly written on its own), so it comes down to what your textbook or lecture defined as the threshold for switching to the other processor.

Disclaimer, don't blame me if you miss this question on your homework/test.

Outside an academic exercise like this, the real world is full of pipelined processors (not all processors, but most of the ones people write programs for), and you basically can't put a single number on clock cycles per instruction type in a way that supports this calculation, because of a laundry list of factors. Make sure you understand that: it's a nice exercise, but that specific exercise is difficult and dangerous to attempt on real-world processors. Dangerous in that, however hard you work, you may be measuring something incorrectly, jumping to the wrong conclusions, and making bad recommendations as a result. At the same time, it is very much true that a faster clock improves some percentage of the execution while another percentage suffers, and the question is whether there is a net gain or loss. Or a new processor design, faster or slower, may have features that perform better than an older processor's, but not every feature will be better; there is a trade-off, and then we get into what "better" means.

Upvotes: 1

Peter Cordes

Reputation: 365517

I think the point is to evaluate a design change that makes an instruction take more clocks but allows you to raise the clock frequency. (i.e. leaning towards a "speed demon" design like the Pentium 4, instead of a "brainiac" like Apple's A7/A8 ARM cores. http://www.lighterra.com/papers/modernmicroprocessors/)

So you need to calculate instructions per second to see which one will get more work done in the same amount of real time. i.e. (clock/sec) / (clocks/insn) = insn/sec, cancelling out the clocks from the units.
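That unit cancellation can be sketched as a throughput comparison in Python (frequencies in Hz, CPI values from the question):

```python
def insns_per_sec(freq_hz, cpi):
    # (cycles/sec) / (cycles/insn) = instructions/sec.
    return freq_hz / cpi

m1 = insns_per_sec(2.5e9, 5.15)  # ~4.85e8 insn/s
m2 = insns_per_sec(2.7e9, 5.30)  # ~5.09e8 insn/s
print(m2 > m1)  # the higher-clocked design does more work per second
```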

Your CPI calculation looks OK; I didn't check the arithmetic, but yes, it's a weighted average of the cycle counts according to the instruction mix.


These numbers are obviously super simplified; any CPU worth building at 2.5GHz would have some kind of branch prediction so the cost of a branch isn't just a 3 or 4 instruction bubble. And taking ~5 cycles per instruction on average is pathetic. (Most pipelined designs aim for at least 1 instruction per clock.)

Caches and superscalar CPUs also lead to complex interactions between instructions depending on whether they depend on earlier results or not.

But this is sort of like what you might do if considering increasing the L1d cache load-use latency by 1 cycle (for example), if that took it off the critical path and let you raise the clock frequency. Or vice versa, tightening up the latency or reducing the number of pipeline stages on something at the cost of reducing frequency.

Upvotes: 3
