Reputation: 3458
I will be working on a project which requires writing a simulator for a specific instruction set (one that may not be for a real processor). Preferably this simulator will be something like the SPIM simulator for the MIPS ISA: it would show the contents of all the registers, memory locations, etc., and let me step through instructions. Is there a standard set of steps for writing simulators? Where should I start?
I know Java and C++, I have finished two courses in computer architecture, and I am working in a team of 3.
Upvotes: 4
Views: 7058
Reputation: 71526
I would say you need to work on a disassembler first. The simulator is the next step beyond a disassembler. For example, an x86 (or other variable-word-length instruction set) disassembler would need to follow the execution of the code: first it has to know, based on the opcode, how many bytes are used by that instruction; second, is this a branch, and if so what kind, conditional or unconditional, and react accordingly. The simulator does all this plus it simulates the registers. The disassembler would be for educational purposes, not a polished publish-on-sourceforge thing, just enough to get the feel for how to parse the instructions at the bit level and compute jump offsets.
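To make that concrete, here is a minimal sketch of the decode step for a made-up ISA with an 8-bit opcode byte. The encodings (NOP, LDI, BEQ and their lengths) are invented purely for illustration, not from any real processor; the point is deciding instruction length and computing a branch target from the bits.

    // Decode one instruction starting at pc, print it, and return its length in bytes.
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    size_t disasm_one(const std::vector<uint8_t>& mem, size_t pc)
    {
        uint8_t op = mem[pc];
        switch (op) {
        case 0x00:                                   // NOP: 1 byte
            std::printf("%04zx: nop\n", pc);
            return 1;
        case 0x10:                                   // LDI r, imm8: 3 bytes
            std::printf("%04zx: ldi r%u, #%u\n", pc,
                        (unsigned)mem[pc + 1], (unsigned)mem[pc + 2]);
            return 3;
        case 0x20: {                                 // BEQ rel8: conditional branch, 2 bytes
            int8_t rel = static_cast<int8_t>(mem[pc + 1]);
            std::printf("%04zx: beq 0x%04zx\n", pc, pc + 2 + rel);  // target = next pc + offset
            return 2;
        }
        default:                                     // unknown byte, dump it and move on
            std::printf("%04zx: .db 0x%02x\n", pc, (unsigned)op);
            return 1;
        }
    }

    int main()
    {
        std::vector<uint8_t> mem = {0x10, 0x01, 0x2a, 0x20, 0xfb, 0x00};  // tiny test image
        for (size_t pc = 0; pc < mem.size(); )
            pc += disasm_one(mem, pc);
    }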
I would start with a simple instruction set like the 12-bit PIC, the 6502, or something like that (the MSP430 is another possible candidate). If you go with an existing instruction set for learning how to write the simulator, you can capitalize on the existing toolchains (assemblers and maybe compilers) for that ISA and divide the task in half. For a new instruction set you will need a toolchain at some point, or at least a good disassembler if you want to hand-code the machine code (the disassembler double-checks that you have the machine code you thought you wanted). Or you could have an option in the simulator to output disassembled code in execution order, which is good for debugging anyway.
Don't worry about interrupts at first, but be aware that you are going to need a mechanism for stopping the flow of the simulation, saving state, and re-entering the simulator for so many clock cycles at a time. You will see that a lot of simulators essentially single-step each instruction, evaluating the number of clock ticks each instruction takes for hardware reasons: periodic interrupts, or other simulated hardware that needs to stay in sync with the simulated processor. So it would be a good idea to start by modelling yours that way and then come up with performance improvements later as needed.
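Here is a minimal sketch of that "single-step each instruction and count clock ticks" loop, with a periodic timer interrupt kept in sync with the cycle count. All the names (cpu_t, step, raise_timer_irq) and the 2-ticks-per-instruction stub are placeholders invented for the example, not part of any real tool.

    #include <cstdint>
    #include <cstdio>

    struct cpu_t {
        uint64_t cycles = 0;    // total simulated clock ticks so far
        bool     halted = false;
        // ... registers, flags, memory would live here ...
    };

    // Placeholder: execute one instruction, return how many clock ticks it consumed.
    unsigned step(cpu_t& cpu)
    {
        if (cpu.cycles >= 100) cpu.halted = true;   // stop the demo eventually
        return 2;                                   // pretend every instruction takes 2 ticks
    }

    // Placeholder for a simulated peripheral.
    void raise_timer_irq(cpu_t& cpu)
    {
        std::printf("timer irq at cycle %llu\n", (unsigned long long)cpu.cycles);
    }

    int main()
    {
        cpu_t cpu;
        uint64_t next_timer = 32;                   // fire a timer every 32 simulated ticks
        while (!cpu.halted) {
            cpu.cycles += step(cpu);                // one instruction at a time
            if (cpu.cycles >= next_timer) {         // simulated hardware stays in sync
                raise_timer_irq(cpu);
                next_timer += 32;
            }
        }
    }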
If you have not programmed using state machines I highly recommend learning a little about them and writing some practice programs.
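For practice, a toy fetch/decode/execute state machine is enough to get the flavor; everything below is invented for illustration.

    #include <cstdio>

    enum class State { Fetch, Decode, Execute, Halt };

    int main()
    {
        State s = State::Fetch;
        int remaining = 3;                       // pretend the program has 3 instructions
        while (s != State::Halt) {
            switch (s) {
            case State::Fetch:   std::puts("fetch");   s = State::Decode;  break;
            case State::Decode:  std::puts("decode");  s = State::Execute; break;
            case State::Execute: std::puts("execute");
                                 s = (--remaining > 0) ? State::Fetch : State::Halt;
                                 break;
            case State::Halt:    break;
            }
        }
    }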
I am working on a project, which I hope to publish if I ever finish it, that may help with topics like this. My instruction set simulators these days are actually written in a hardware description language (Verilog, for example; not the HDL I use, but mine converts to Verilog), then driven by something like Verilator with a C/C++ wrapper. So the work of extracting opcodes and bit fields is handled by a language designed to manage just that, and the parts that simulate peripherals or provide GUIs or other ways of displaying what is going on are handled by a language designed for that.

If you happen to be working on a project where this new instruction set is realized in VHDL or Verilog, I highly recommend a hybrid HDL-simulator-plus-software solution, either through VPI if it is a commercial HDL simulator, or through other simpler methods if you go with Verilator or Icarus Verilog or GHDL or others (Icarus may need the commercial-like VPI interface). The beauty here is that you are closer to the real hardware: bugs, warts, etc. in the hardware can be found and dealt with before going to silicon, and changes or improvements in the hardware are instantly realized in the simulation instead of having to make matching changes in the simulator.

The HDL simulators have scripty languages, or SystemC, etc., which are fine for getting the hardware functionally tested, but you can't run that code on silicon. If you instead discourage the use of those tools, run real binaries on the simulated logic, and provide an abstraction layer so that real tools can load, debug, or monitor (for example, write a back end for openocd that passes through a host-to-HDL-sim layer and talks to the simulated JTAG debugger on the chip; quite doable, as I have done it myself, depending on the complexity of the design), then those real programs can be run both in simulation and on silicon. That saves time in writing tests, and gets test software written well before silicon arrives instead of starting after it arrives.
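For reference, the Verilator-with-a-C++-wrapper approach looks roughly like the sketch below. It assumes a Verilog top module named "top" with clk and reset inputs, so Verilator generates a Vtop class; the port names and everything around the eval loop are assumptions for illustration, not a complete build recipe.

    #include <verilated.h>
    #include "Vtop.h"                      // generated by: verilator --cc top.v

    int main(int argc, char** argv)
    {
        Verilated::commandArgs(argc, argv);
        Vtop* top = new Vtop;

        top->reset = 1;                    // hold the design in reset for a few cycles
        for (int cycle = 0; cycle < 1000 && !Verilated::gotFinish(); ++cycle) {
            if (cycle == 4) top->reset = 0;
            top->clk = 0; top->eval();     // falling edge
            top->clk = 1; top->eval();     // rising edge: one simulated clock
            // C++ side: peek at registers/memory here, drive simulated peripherals,
            // update a GUI, etc.
        }

        top->final();
        delete top;
        return 0;
    }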
Sorry for the HDL tangent. If this is a new or otherwise toolchain-less instruction set, you are going to have to spend time on that toolchain, at least an assembler and a linker, and I would do some sort of disassembler and then turn that disassembler, or portions of it, into the first cut at the simulator. As far as standard practice in instruction set simulators goes, basically think in terms of single-stepping so that you can handle clocks, interrupts, and other simulated hardware. The core function of the simulation either simulates one clock cycle, which may not be a complete instruction (a state-machine approach), or simulates a single instruction and returns, modifying registers, flags, and memory as demanded by the instruction (and counting simulated clocks). Depending on how feature-rich you want this simulator to be from the very beginning, make your reads and writes of RAM/ROM and registers into functions so that you can trap memory and register changes in one place for display to the user; only access, or at least modify, registers and RAM through these abstract functions (see the sketch below). Beyond that it is just a ton of typing: the more complicated the instruction set and the more instructions, the more lines of code to type.
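A minimal sketch of that structure, assuming an invented 3-instruction encoding (LDI, ST, anything else treated as HALT): every register and memory change funnels through a handful of accessor functions, so tracing lives in one place, and the core step() function executes one instruction and returns its simulated clock count.

    #include <cstdint>
    #include <cstdio>
    #include <array>
    #include <vector>

    struct Cpu {
        std::array<uint32_t, 16> regs{};
        uint32_t pc = 0;
        std::vector<uint8_t> mem = std::vector<uint8_t>(0x10000);
        bool halted = false;
    };

    // All register/memory traffic goes through these, so tracing is in one place.
    uint8_t  read8 (Cpu& c, uint32_t a)             { return c.mem[a]; }
    void     write8(Cpu& c, uint32_t a, uint8_t v)  { std::printf("mem[%04x] <= %02x\n", (unsigned)a, (unsigned)v); c.mem[a] = v; }
    uint32_t getreg(Cpu& c, unsigned r)             { return c.regs[r]; }
    void     setreg(Cpu& c, unsigned r, uint32_t v) { std::printf("r%u <= %08x\n", r, (unsigned)v); c.regs[r] = v; }

    // Core function: simulate one instruction, return simulated clock ticks.
    unsigned step(Cpu& c)
    {
        uint8_t op = read8(c, c.pc);
        switch (op) {
        case 0x10:  // LDI r, imm8
            setreg(c, read8(c, c.pc + 1) & 0xF, read8(c, c.pc + 2));
            c.pc += 3;  return 2;
        case 0x30:  // ST r -> [imm16]  (little-endian address)
            write8(c, read8(c, c.pc + 2) | (read8(c, c.pc + 3) << 8),
                   (uint8_t)getreg(c, read8(c, c.pc + 1) & 0xF));
            c.pc += 4;  return 3;
        default:    // treat anything else as HALT
            c.halted = true;  return 1;
        }
    }

    int main()
    {
        Cpu c;
        const uint8_t prog[] = {0x10, 0x01, 0x2a,          // ldi r1, #42
                                0x30, 0x01, 0x00, 0x80,    // st  r1, [0x8000]
                                0xff};                     // halt
        for (size_t i = 0; i < sizeof prog; ++i)           // load the test image directly
            c.mem[i] = prog[i];
        while (!c.halted) step(c);
    }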
Upvotes: 5
Reputation: 86333
The first step would be to create an assembler for your CPU. The assembler has no relation whatsoever to the simulator and in my opinion including an assembly parser in the simulator would needlessly complicate things.
Using a separate assembler keeps things modular, and it allows your simulator to take simple binary code as its input: code that you can even create, verify, and modify separately. It is also in accordance with current practice.
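In practice that split means the assembler writes a flat binary image and the simulator just loads it into simulated memory before stepping begins. A small sketch, with the file name, load address, and memory size made up for the example:

    #include <cstdio>
    #include <cstdint>
    #include <vector>

    int main()
    {
        std::vector<uint8_t> mem(0x10000);                   // 64 KB of simulated memory

        std::FILE* f = std::fopen("program.bin", "rb");      // output of the separate assembler
        if (!f) { std::perror("program.bin"); return 1; }
        size_t n = std::fread(mem.data(), 1, mem.size(), f); // load at address 0
        std::fclose(f);
        std::printf("loaded %zu bytes\n", n);

        // ... hand mem to the simulator core and start stepping from the reset address ...
    }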
As for your simulator, searching Google for "writing a CPU simulator" has this as a first hit, and there are quite a few more interesting results.
Upvotes: 4