Reputation: 48697
I'm writing an x86 backend for a compiler and finding it really tedious to encode the machine code for each assembly instruction I need. I'm obviously reinventing the wheel. Is there a declarative representation of this instruction set anywhere, e.g. an XML file mapping instruction operations and operands to bytes?
Upvotes: 4
Views: 569
Reputation: 1351
I am assuming below that you don't want to depend on something huge like LLVM at runtime.
The reason I have researched this question is that I want to add a machine code emitter to a self-hosting Lisp whose size is in the ballpark of 2000-3000 LoC. Settling for the current LLVM dependency, or a GNU assembler dependency, would invalidate the very ideal behind this project: self-host from as little code as feasible.
Here's what I have found so far:
LLVM's TableGen description of x86 is comprehensive, but not the simplest option. Using it is not trivial: to generate your own code from its declarative description you'll need to write C++ code (unless you're ready to parse and process its format yourself).
https://llvm.org/docs/TableGen/index.html
lib/Target/X86/X86InstrInfo.td
llvm-tblgen-10 --help
codegen_x86.h is basically a web of #define C macros that can be processed relatively simply (see a Lisp example).
Here are the copies/versions that I have found online:
https://github.com/cebix/macemu/blob/master/BasiliskII/src/uae_cpu/compiler/codegen_x86.h
https://unix.superglobalmegacorp.com/previous/newsrc/src/cpu/jit/codegen_x86.h.html
https://github.com/probonopd/previous/blob/master/src/cpu/jit/codegen_x86.h
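To give a feel for what a "web of #define macros" encoder looks like, here is a minimal sketch in C. It is not copied from codegen_x86.h: the CodeBuf type and every macro name (EMIT8, EMIT32, MODRM, MOV_R32_IMM32, ADD_R32_R32, RET) are made up for illustration, and only three 32-bit instruction forms are covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical output buffer; not taken from codegen_x86.h. */
typedef struct {
    uint8_t bytes[64];
    size_t  len;
} CodeBuf;

/* x86 register numbers as used in opcode and ModRM fields. */
enum { EAX = 0, ECX = 1, EDX = 2, EBX = 3 };

/* Append one byte, checking bounds. */
#define EMIT8(b, v) \
    do { assert((b)->len < sizeof((b)->bytes)); \
         (b)->bytes[(b)->len++] = (uint8_t)(v); } while (0)

/* Append a 32-bit immediate in little-endian order. */
#define EMIT32(b, v) \
    do { uint32_t v_ = (uint32_t)(v); \
         EMIT8(b, v_);       EMIT8(b, v_ >> 8); \
         EMIT8(b, v_ >> 16); EMIT8(b, v_ >> 24); } while (0)

/* ModRM byte: mod (2 bits), reg (3 bits), rm (3 bits). */
#define MODRM(mod, reg, rm) \
    (((mod) << 6) | (((reg) & 7) << 3) | ((rm) & 7))

/* mov r32, imm32 : B8+rd id */
#define MOV_R32_IMM32(b, rd, imm) \
    do { EMIT8(b, 0xB8 + (rd)); EMIT32(b, imm); } while (0)

/* add r/m32, r32 : 01 /r, register-direct form (mod = 3) */
#define ADD_R32_R32(b, dst, src) \
    do { EMIT8(b, 0x01); EMIT8(b, MODRM(3, src, dst)); } while (0)

/* ret : C3 */
#define RET(b) EMIT8(b, 0xC3)

/* Emit "mov eax, 42; add eax, ecx; ret" and return the byte count. */
size_t emit_demo(CodeBuf *b)
{
    MOV_R32_IMM32(b, EAX, 42);
    ADD_R32_R32(b, EAX, ECX);
    RET(b);
    return b->len;
}
```

Because each macro body is just a sequence of byte emissions, a tool written in another language can in principle read such definitions and derive its own encoder from them, which is the kind of processing described above.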
Upvotes: 1
Reputation: 4220
I highly recommend using DynASM for this. It's not a declarative description, but it gives you absolute control over what instructions are emitted, and it's much easier to use than a declarative description would be. It's the ideal way of writing a platform-specific codegen IMO.
It is also very small and unimposing: the runtime is completely contained within a few hundred lines of .h files.
See my DynASM tutorial for an example of writing a very simple codegen with DynASM.
Even if you're not convinced about DynASM, you'll find in the DynASM codebase a pretty concise declarative description of x86 instructions, which you might find useful.
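For readers wondering what a declarative description buys you, here is a rough sketch in C of the general idea: a table of rows, each mapping a mnemonic and operand form to an opcode byte, driven by one generic encoder. This is not DynASM's actual format. The names InsnDesc, OperandForm, insn_table and encode are hypothetical, and the table covers only a handful of 32-bit register forms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Operand shapes this sketch understands (hypothetical, far from complete). */
typedef enum {
    OPS_NONE,       /* no operands:  opcode                  */
    OPS_RM32_R32,   /* r/m32, r32:   opcode /r with mod = 3  */
    OPS_R32_IMM32   /* r32, imm32:   opcode+rd, then imm32   */
} OperandForm;

typedef struct {
    const char *mnemonic;
    OperandForm form;
    uint8_t     opcode;
} InsnDesc;

/* The declarative part: one row per instruction form. */
static const InsnDesc insn_table[] = {
    { "ret", OPS_NONE,      0xC3 },
    { "add", OPS_RM32_R32,  0x01 },
    { "sub", OPS_RM32_R32,  0x29 },
    { "xor", OPS_RM32_R32,  0x31 },
    { "mov", OPS_RM32_R32,  0x89 },
    { "mov", OPS_R32_IMM32, 0xB8 },
};

/* Encode one instruction into out (room for at least 5 bytes).
   For OPS_RM32_R32, a is the destination register and b the source.
   For OPS_R32_IMM32, a is the register and b the immediate.
   Returns the number of bytes written, or 0 if no table row matches. */
size_t encode(const char *mnemonic, OperandForm form,
              unsigned a, uint32_t b, uint8_t *out)
{
    for (size_t i = 0; i < sizeof insn_table / sizeof insn_table[0]; i++) {
        const InsnDesc *d = &insn_table[i];
        if (d->form != form || strcmp(d->mnemonic, mnemonic) != 0)
            continue;
        switch (form) {
        case OPS_NONE:
            out[0] = d->opcode;
            return 1;
        case OPS_RM32_R32:
            out[0] = d->opcode;
            out[1] = (uint8_t)(0xC0 | ((b & 7) << 3) | (a & 7));
            return 2;
        case OPS_R32_IMM32:
            out[0] = (uint8_t)(d->opcode + (a & 7));
            out[1] = (uint8_t)b;
            out[2] = (uint8_t)(b >> 8);
            out[3] = (uint8_t)(b >> 16);
            out[4] = (uint8_t)(b >> 24);
            return 5;
        }
    }
    return 0;
}
```

With this layout, supporting another instruction of an existing form means adding a table row rather than writing new encoding code. For example, encode("xor", OPS_RM32_R32, 0, 0, out) writes the two bytes 31 C0, i.e. xor eax, eax.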
Upvotes: 4