Reputation: 61
This is an exam question (practice exam, not the real one) but I have no idea how to work out the answer:
The computer has a 64-bit CPU with a speed of 2GHz. The instructions perform programmed Input/Output to a device which is attached to a 33MHz bus (32 bits wide). The device registers are memory-mapped in the range e000 to e0ff.
(MOVQ $A,R performs a 64-bit copy from address A to register R.)
Roughly how many CPU cycles would these instructions take to execute?
It's a multiple-choice question and here are the possible options:
So, although I'm looking for the answer, I'm also looking for the reasoning behind the correct answer so that I could answer this correctly by working it out myself if a similar question is in the real exam.
I can't find anything in the textbooks or online to help me calculate this.
Upvotes: 0
Views: 3245
Reputation: 37214
For each read:
the CPU decodes the instruction, figures out what it wants, and ends up sending a "read request" out on a bus or link. Let's pretend this costs 1 CPU cycle.
the bus or link the CPU uses probably won't be the 33 MHz bus that the device uses (because a 2 GHz CPU with 33 MHz RAM just isn't plausible). Typically there's a fast local bus (or fast set of links) connecting CPU/s, RAM and other stuff (routers, bridges); then slower bus/es on the other side of that other stuff. Let's say it takes 2 "fast bus cycles" for the CPU's "read request" to be forwarded from the fast bus to the device's 33 MHz bus; and let's also pretend that the "2 fast bus cycles" are equivalent to 4 CPU cycles.
when the "read request" makes it to the device's 33 MHz bus; that bus is 32-bit, but the read request itself will consist of multiple fields - e.g. maybe a "transaction type" field and a "32-bit address" field (and maybe a "start/end" for synchronization and maybe a CRC for error detection and..). Let's pretend it takes 2 bus cycles (1 bus cycle for each field) for the "read request" to make it to the device.
after the device receives the "read request" it needs to decode it and figure out what to do. This could take a few more bus cycles (but it really depends on the device itself), but let's pretend it takes 2 cycles.
eventually the device will send back a "read reply". This will probably be a larger packet (e.g. "transaction type" field, "32-bit address" field, then "64-bit data" split into two 32-bit fields). Let's say it takes 4 bus cycles for this to make its way back to the router/bridge on the "fast bus".
at the "fast bus", maybe we can expect the equivalent of 6 CPU cycles for the "read reply" to make it back to the CPU.
once the "read reply" arrives back at the CPU, the CPU might spend 1 more cycle completing the instruction.
If you add all this up, it becomes "1 + 4 + 6 + 1 = 12" CPU cycles plus "2 + 2 + 4 = 8" bus cycles. Now we need to convert bus cycles into CPU cycles - "2 GHz / 33 MHz = 60.606" so we can say that 1 bus cycle is equivalent to 61 CPU cycles (rounding up because you can't really have a fraction of a CPU cycle); and 8 bus cycles is equivalent to 488 CPU cycles.
Now we can plug that in and say that it might take "12 + 488 = 500" CPU cycles for one read to complete. There are 5 reads so it might take 5 times as long, so it might take 2500 CPU cycles.
Now... we made up a bunch of missing information, and "2500 CPU cycles" isn't one of the choices. The natural thing to do in this case is to try to guess why the question is nonsense. The missing information implies that various things that matter were ignored by the question, and therefore it's reasonable to assume that the wrong answer that the question thinks is right will be less than any right answer that the question thinks is wrong (and less than the answer we obtained from filling in missing information). This assumption leads us to "d. Approximately 1200 CPU cycles".
With an assumed answer we can work backwards. If the assumed answer is 1200 cycles and there are 5 instructions then it'd be 240 CPU cycles per instruction. If 1 bus cycle is equivalent to 61 CPU cycles, then 240 CPU cycles would be roughly equivalent to 4 bus cycles (ignoring time spent decoding the instruction, etc. at the CPU, which is likely negligible). Now we can pretend that maybe the "read request" has a 32-bit address and nothing else (no transaction type) and maybe the "read reply" has a 32-bit address and a 64-bit value and nothing else (no transaction type). Of course this is pure nonsense (no router/bridge implies RAM is either 33 MHz or doesn't exist; no transaction type means there's no obvious way to determine if anything on the bus is a request or a reply, or a read or a write, or any other kind of transaction - e.g. IRQ or error code or coherency traffic or ..; no time for the device to determine what to do implies the device has found a way for electronics to break the speed of light).
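For anyone who wants to play with the numbers, the estimate can be reproduced with a few lines of arithmetic. This is only a sketch: the per-step costs are the made-up figures from the steps listed above (1, 4, 6 and 1 CPU cycles; 2, 2 and 4 bus cycles), and the names `CPU_PER_BUS` and `read_cost` are hypothetical.

```python
import math

CPU_HZ = 2_000_000_000   # 2 GHz CPU, from the question
BUS_HZ = 33_000_000      # 33 MHz device bus, from the question

# CPU cycles per bus cycle, rounded up to a whole CPU cycle
CPU_PER_BUS = math.ceil(CPU_HZ / BUS_HZ)


def read_cost(cpu_cycles, bus_cycles):
    """CPU cycles for one read, given assumed per-step costs."""
    return sum(cpu_cycles) + sum(bus_cycles) * CPU_PER_BUS


# Assumed costs from the steps above (guesses, not measured values)
cpu_steps = [1, 4, 6, 1]   # issue, fast bus out, fast bus back, complete
bus_steps = [2, 2, 4]      # request on device bus, device decode, reply

per_read = read_cost(cpu_steps, bus_steps)
total = 5 * per_read       # 5 reads in the question

# Working backwards: bus cycles per read implied by a 1200-cycle total
implied_bus_cycles = 1200 / 5 / CPU_PER_BUS

print(per_read, total, round(implied_bus_cycles, 2))
```

Changing any entry in `cpu_steps` or `bus_steps` shows how strongly the result depends on the guessed bus cycles and how little the CPU-side guesses matter.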
Upvotes: 0
Reputation: 58762
Since registers are 64 bit, you need to transfer 64 bits. The bus is 32 bits wide, so that means 2 transfers. At 33MHz bus speed, that's 2/33MHz ~ 60.6ns. Since the cpu is 2GHz each cycle is 1/2GHz=0.5ns. Thus number of cycles is 60.6ns/0.5ns ~ 120. That's for each of the instructions. If the question means total, then of course it's 5*120 = 600.
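The same calculation as a short sketch, assuming (as this answer does) that the only cost is moving 64 bits across the 32-bit bus and that there are 5 instructions; the variable names are hypothetical.

```python
CPU_HZ = 2_000_000_000    # 2 GHz CPU
BUS_HZ = 33_000_000       # 33 MHz bus
BUS_WIDTH_BITS = 32
REGISTER_BITS = 64

# A 64-bit read over a 32-bit bus needs two transfers
transfers = REGISTER_BITS // BUS_WIDTH_BITS

# Time for those transfers, and the length of one CPU cycle, in nanoseconds
transfer_time_ns = transfers / BUS_HZ * 1e9
cpu_cycle_ns = 1 / CPU_HZ * 1e9

cycles_per_instruction = transfer_time_ns / cpu_cycle_ns
total_cycles = 5 * cycles_per_instruction

print(round(transfer_time_ns, 1), round(cycles_per_instruction), round(total_cycles))
```

Without rounding early, the total comes out slightly above 600, which is still "roughly 600" for the purposes of a multiple-choice question.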
Upvotes: 1
Reputation: 26636
I think we can only give you the most naïve answer.
Let's make some assumptions.
Thus, accessing the 5 memory locations would take 10 bus cycles.
The main CPU is clocked at 2000 MHz and the bus at 33 MHz; that is a ratio of ~60:1.
Answer is then ~600 CPU cycles.
Upvotes: 1