How is ROM latency accounted for in CPU design?

Question

I'm trying to design a simple CPU in VHDL for an Altera FPGA, and I'm trying to get my head around how to account for the latency incurred by the ROM blocks. The ROM blocks can have both the input address and the output data clocked, or just the input address clocked, giving a latency of 2 or 1 clock cycles respectively between requesting data (setting the address) and getting the data back.

I can understand that if the ROM were essentially a massive data mux, things like jumps would be trivial: you just set the address and, by the next clock cycle, the correct instruction is there. I just don't quite understand how to manage this when there is latency between the ROM and the CPU. From what I gather, each instruction needs to determine whether to fetch the next instruction, modify the program counter (PC) for a jump, or stall (keep the PC the same). But surely, if there is a latency of 2 cycles, that decision needs to be known 2 cycles ahead?

How does one go about writing a PC for this kind of system?

For reference, the memory data width will be the same size as the instructions so each memory location stores one instruction.

Martin Zabel · Accepted Answer

On an FPGA, it is almost always sufficient to have only the input address register, giving a latency of 1 clock cycle. You can then address the ROM with the next value of the PC register instead of the current value.

The next value is the value that will be loaded into the PC register at the next rising (or falling) clock edge. The same next value is loaded into the ROM address register at that clock edge. Thus both registers hold the same content, and after the clock edge the ROM delivers the data at the (new) PC.
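
As a rough illustration of this idea, here is a minimal VHDL sketch. The entity name, port names, the 8-bit address and 16-bit instruction widths, and the all-zero ROM contents are assumptions made for the example, not details from the question. Whether the array is mapped to a block ROM depends on the synthesis tool, although a registered read address is the pattern tools commonly look for.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Sketch only: entity, port names and widths are hypothetical.
entity fetch is
  port (
    clk         : in  std_logic;
    rst         : in  std_logic;                     -- synchronous reset
    stall       : in  std_logic;                     -- '1' holds the PC
    jump        : in  std_logic;                     -- '1' loads jump_target
    jump_target : in  unsigned(7 downto 0);
    instr       : out std_logic_vector(15 downto 0)  -- instruction at current PC
  );
end entity fetch;

architecture rtl of fetch is
  type rom_t is array (0 to 255) of std_logic_vector(15 downto 0);
  -- Placeholder contents; a real design would fill in the program here.
  constant ROM : rom_t := (others => (others => '0'));

  signal pc       : unsigned(7 downto 0) := (others => '0');
  signal pc_next  : unsigned(7 downto 0);
  signal rom_addr : unsigned(7 downto 0) := (others => '0');
begin
  -- Combinational "next value" of the PC.
  pc_next <= to_unsigned(0, pc'length) when rst = '1'   else
             pc                        when stall = '1' else
             jump_target               when jump = '1'  else
             pc + 1;

  -- The PC and the ROM address register load the same value on the same edge.
  process (clk)
  begin
    if rising_edge(clk) then
      pc       <= pc_next;
      rom_addr <= pc_next;
    end if;
  end process;

  -- Read through the registered address: after the edge, instr matches pc.
  instr <= ROM(to_integer(rom_addr));
end architecture rtl;
```

Because pc and rom_addr always hold the same value, instr is the instruction at the current PC. From the CPU's point of view, the 1-cycle ROM latency is hidden behind the PC register itself.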

If you have more than 2 pipeline stages, the ROM output is stored in the instruction register. In that case, you automatically have a ROM output register.
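
A minimal sketch of such an instruction register, assuming a 16-bit instruction word and a hypothetical enable input for stalling (both are illustrative choices, not part of the original answer):

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch only: names and the 16-bit width are hypothetical.
entity instr_reg is
  port (
    clk   : in  std_logic;
    en    : in  std_logic;                      -- '0' holds the current instruction
    rom_q : in  std_logic_vector(15 downto 0);  -- data output of the ROM
    ir    : out std_logic_vector(15 downto 0)   -- instruction seen by the next stage
  );
end entity instr_reg;

architecture rtl of instr_reg is
begin
  -- Capturing the ROM data on the clock edge makes this register act as
  -- the ROM output register.
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        ir <= rom_q;
      end if;
    end if;
  end process;
end architecture rtl;
```

A synthesis tool may absorb a register like this into the memory block's optional output register, but that is tool-dependent.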
