Mahdi
Mahdi

Reputation: 2305

What's the smallest subset of language features you need to bootstrap its compiler?

What would be the absolute necessary core features of a language (inspired from C) as a sub-language which can be used to write a compiler for the whole language?

Upvotes: 2

Views: 1141

Answers (3)

Jörg W Mittag
Jörg W Mittag

Reputation: 369556

You need a while loop, if, one true integer variable, and a way to read and write a file. That's it. (Actually, the file reading and writing part is nice-to-have, but not strictly necessary – You only need it to get information into and out of the program. And if you can read and write a file, then you don't need the integer variable anymore, since you can use the file as a temporary store.)

while, if and one integer variable is Turing-complete, i.e. it can compute any Turing-computable function. A compiler is a Turing-computable function. A compiler which cannot take any input or produce any output is pretty boring, so you need to have a way to read some input and write some output.

Upvotes: 5

Ira Baxter
Ira Baxter

Reputation: 95402

You can define bootstrappable metacompilers consisting of 20 odd lines that can do this. The MetaII compiler is a particularly good example from 1963. I have bootstrapped much bigger compilers from MetaII as a base back in the 1970s.

These meta compilers require the ability to parse a metacompiler description (especially their own so bootstrapping is possible), that defines an EBNF syntax (test for input string, scan next token, ...) and a set of generator actions (output literal string, output last token scanned, output generated label). You can implement a library in practically any language to implement the support for this in typically a few hundred lines in any procedural language.

Here is MetaII's self description, taken directly from the original paper: MetaII Self Description.
Yes, that's the whole damn thing. (Exercise for the really motivated reader: you can simplify the minimal supporting instruction set, and this description).

Here is a beautiful tutorial on how to build/understand this gem in JavaScript: MetaII Tutorial.

Doug Michels, a grad student at Santa Cruz back in the 1980s, took this to an extreme. If you encode language tokens as single characters, you can define a bootstrappable metacompiler that self-describes in 80 characters. You have to get the thesis from Santa Cruz if you want to see the details.

Upvotes: 3

OmegaNerd
OmegaNerd

Reputation: 96

There are two ways to interpret your question: as a theoretical computer science question; and, as a practical engineering question.

There's already an answer that leans toward the theoretical answer. So, I'm going to go more to the practical side.

I think you'll need integers, pointers, variables, an if statement, a loop statement, and functions. As the other post points out, you'll need to have some way of reading from a file to get the source code to compile and writing to a file to save the generated assembly or object code.

I'd suggest you look into the Small C compiler. It's a compiler for a subset of C that is able to compile itself. If you look at the Wikipedia page for Small C, you'll see some books have been published about the compiler. While the books are out of print, you may be able to find one available second hand.

Upvotes: 1

Related Questions