Reputation: 255
I'm trying to write a programming language and I'm stuck at the code generation phase.
After careful consideration, I decided to use LLVM as my back end because I don't want to deal with obscure low-level details (generating assembly is fine with me, but I would need to learn more about linking to get it working).
One stumbling block is that my compiler is not written in C++, which means I cannot use the ready-made LLVM classes in my code.
Could I generate LLVM IR as a character string, save it to a file (or is that unnecessary?), and then compile it? If so, is there another format I could generate that would help LLVM run faster?
Thanks in advance for any advice.
Upvotes: 4
Views: 1328
Reputation: 2004
Quoting the LLVM documentation:
Your compiler front-end will communicate with LLVM by creating a module in the LLVM intermediate representation (IR) format. Assuming you want to write your language’s compiler in [something else than C++], there are 3 major ways to tackle generating LLVM IR from a front-end:
1. Call into the LLVM libraries code using your language’s FFI (foreign function interface).
- for: best tracks changes to the LLVM IR, .ll syntax, and .bc format
- for: enables running LLVM optimization passes without an emit/parse overhead
- for: adapts well to a JIT context
- against: lots of ugly glue code to write
2. Emit LLVM assembly from your compiler’s native language.
- for: very straightforward to get started
- against: the .ll parser is slower than the bitcode reader when interfacing to the middle end
- against: it may be harder to track changes to the IR
3. Emit LLVM bitcode from your compiler’s native language.
- for: can use the more-efficient bitcode reader when interfacing to the middle end
- against: you’ll have to re-engineer the LLVM IR object model and bitcode writer in your language
- against: it may be harder to track changes to the IR
The option you mention is number 2.
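To make the second option concrete, here is a minimal sketch of a front end that builds LLVM assembly as a plain string and writes it to a .ll file. Python stands in for whatever language your compiler is written in. The function names (`emit_module`, `write_module`, `compile_with_clang`), the file names, and the tiny `add`/`main` program are all made up for illustration, and the compile step assumes a `clang` executable is on your PATH; it is skipped if none is found.

```python
import shutil
import subprocess
from pathlib import Path


def emit_module() -> str:
    """Build a tiny LLVM IR module as plain text (hypothetical example program)."""
    lines = [
        "; module produced by a toy front end",
        "define i32 @add(i32 %a, i32 %b) {",
        "entry:",
        "  %sum = add i32 %a, %b",
        "  ret i32 %sum",
        "}",
        "",
        "define i32 @main() {",
        "entry:",
        "  %r = call i32 @add(i32 40, i32 2)",
        "  ret i32 %r",
        "}",
    ]
    return "\n".join(lines) + "\n"


def write_module(path: str) -> Path:
    """Save the generated IR to a .ll file and return its path."""
    out = Path(path)
    out.write_text(emit_module(), encoding="utf-8")
    return out


def compile_with_clang(ll_path: Path, exe_path: str) -> bool:
    """Hand the .ll file to clang, if it is installed (assumption: clang on PATH)."""
    clang = shutil.which("clang")
    if clang is None:
        return False  # no LLVM toolchain found; nothing was compiled
    subprocess.run([clang, str(ll_path), "-o", exe_path], check=True)
    return True


if __name__ == "__main__":
    ll_file = write_module("out.ll")
    if not compile_with_clang(ll_file, "out"):
        print("clang not found; wrote", ll_file, "only")
```

The front end never links against LLVM here: it only produces text, and the LLVM tools do the parsing, optimization, and linking. That is what makes this route easy to start with, at the cost of the .ll parsing overhead mentioned in the quote.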
Upvotes: 4
Reputation: 34391
Could I generate LLVM IR as a character string, save it to a file (or is that unnecessary?), and then compile it?
Some projects use this approach (GHC, for instance), but it is not recommended. Instead, you can use the LLVM C bindings. Most languages can interface with C, so calling the bindings from your language shouldn't be a problem.
Upvotes: 1