Reputation: 1137
I'm a Comp Sci major, and am working toward creating my own programming language as one of my projects this semester. But I'm stuck on one part of the compilation stage.
As I understand, this is the process of compilation from a language such as C#.
C# source code -> compiled into IL by .NET -> Executed by CLR (converts the IL to native machine code as needed - apparently it doesn't do it all at once), which leaves you with a .exe program.
The thing I'm confused about is the stage of going from Intermediate Language (IL), to native machine code.
I mean, I guess I could write my language for .NET (mylang.NET) so that it targets the CLR and lets it do the rest of the work; in that case I'd only need to learn Intermediate Language. But I'd still be interested in making an unmanaged language as well (unmanaged like C++), one that isn't bound by the restrictions of a managed one. That would mean I'd somehow need to know every possible version of machine code I might encounter.
Any help would be greatly appreciated!
Katie
Upvotes: 0
Views: 1020
Reputation: 3871
When you make a compiler, you choose which architectures to target. For a compiler built into a virtual machine such as the CLR or JVM, the choice is simple: you target the architecture the VM is running on. For example, a CLR running on your PC will generate x86 code, and a JVM running on your phone will generate Arm code.
For a stand-alone compiler the choice is more arbitrary and depends on where you want the code to run. If you want the code to run on your PC, you generate x86 code; for your phone you generate Arm code; for your dishwasher you probably generate 8051 code; and for the engine controller in your car you generate code for something like an RH850.
Once you have decided on the target architectures of your compiler, you need to build a code generator for each of them. Typically you focus on the main architecture and only introduce model-specific tweaks if the gain is big, i.e. you construct an Arm compiler, not a Cortex-A9 compiler. This means that a compiler that can generate code for a number of different architectures will contain one code generator per architecture, while all the different versions of the same architecture are handled by one code generator.
Building a code generator requires an understanding of the target architecture and how it works, so trying to build a compiler without learning the target architecture first is doomed to failure. My recommendation for your project is to choose one specific architecture and target that. Personally, I would choose x86, since you can then test the generated code on your PC.
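To make the "one code generator per architecture" idea concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the tiny stack-based intermediate representation, the function names (`gen_x86_64`, `gen_arm32`, `compile_ir`) and the target names. It only emits assembly text and does not assemble or run anything; a real back end would also need register allocation, calling conventions and so on.

```python
# Hypothetical stack-based IR shared by all back ends:
#   ("push", n)  push a constant
#   ("add",)     pop two values, push their sum
#   ("mul",)     pop two values, push their product
IR = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]  # (2 + 3) * 4


def gen_x86_64(ir):
    """Emit x86-64 assembly text (Intel syntax) for the IR."""
    out = []
    for op, *args in ir:
        if op == "push":
            out.append(f"push {args[0]}")
        elif op in ("add", "mul"):
            mnemonic = "add" if op == "add" else "imul"
            out += ["pop rbx", "pop rax", f"{mnemonic} rax, rbx", "push rax"]
        else:
            raise ValueError(f"unknown IR op: {op}")
    return out


def gen_arm32(ir):
    """Emit 32-bit Arm assembly text for the same IR."""
    out = []
    for op, *args in ir:
        if op == "push":
            # 'ldr rX, =const' is an assembler pseudo-instruction for loading a constant
            out += [f"ldr r0, ={args[0]}", "push {r0}"]
        elif op in ("add", "mul"):
            out += ["pop {r1}", "pop {r0}", f"{op} r2, r0, r1", "push {r2}"]
        else:
            raise ValueError(f"unknown IR op: {op}")
    return out


# One code generator per architecture; the front end and the IR are shared.
BACKENDS = {"x86_64": gen_x86_64, "arm32": gen_arm32}


def compile_ir(ir, target):
    """Pick the code generator for the requested target and run it."""
    if target not in BACKENDS:
        raise ValueError(f"unsupported target: {target}")
    return BACKENDS[target](ir)


if __name__ == "__main__":
    for target in BACKENDS:
        print(f"; {target}")
        print("\n".join(compile_ir(IR, target)))
```

The point of the sketch is the split: the front end of your language produces the same IR regardless of target, and supporting another architecture means adding one more entry to `BACKENDS`, not rewriting the compiler.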
Upvotes: 2