Reputation: 28703
I'm looking for a simple, formally defined language that can be used while learning about compiler construction. It should be simple to implement a first pass and then be amenable to further optimization efforts.
Feel free to point me in the direction of lisps, but I'm specifically looking for other options as well.
Upvotes: 6
Views: 821
Reputation: 3519
May I suggest the Jack programming language from http://www.nand2tetris.org/?
It's especially suited for learning compiler construction, as it's part of an academic course.
I am in the midst of a blog post series on writing a compiler for this language, in C#, with code-generation to C. The posts I have already are here: http://blogs.microsoft.co.il/blogs/sasha/archive/tags/Compiler/default.aspx
Upvotes: 5
Reputation: 21
I would suggest Wirth's PL/0.
Why?
The grammar is small, but still there is enough there to get a good taste for developing a compiler:
program =
block "." .
block =
[ "const" ident "=" number {"," ident "=" number} ";"]
[ "var" ident {"," ident} ";"]
{ "procedure" ident ";" block ";" } statement .
statement =
[ ident ":=" expression
| "call" ident
| "begin" statement {";" statement } "end"
| "if" condition "then" statement
| "while" condition "do" statement
].
condition =
"odd" expression
| expression ("="|"#"|"<"|"<="|">"|">=") expression
.
expression =
[ "+"|"-"] term { ("+"|"-") term} .
term =
factor {("*"|"/") factor} .
factor =
ident | number | "(" expression ")" .
You can implement a Virtual Machine compiler for PL/0 in C in about 1000 lines of code.
There are three books related to it:
Wirth, Niklaus (1975), Algorithms + Data Structures = Programs, ISBN 0-13-022418-9 (the original PL/0 specification and implementation (in Pascal)) A great gentle introduction to compiling.
Liffick, Blaise W., Ed (1979), The Byte Book of Pascal, ISBN 0-07-037823-1 (the authors develop a minor superset of PL/0, in Northstar Basic for an early CP/M computer).
Wirth, Niklaus (1986), Compilerbau, B.G. Teubner, Stuttgart ISBN 3-519-32338-9 (a minor superset of PL/0, implemented in Modula 2. In German).
The Web is full of examples.
There is a Wikipedia entry: :-)
In addition, a couple of helpful groups, with lots of people willing to help answer your compiler writing questions:
Also, the Usenet newsgroup comp.compilers is a good place to get information.
Upvotes: 2
Reputation: 5473
I thought Chapter 8 of Kernighan and Pike's The Unix Programming Environment was excellent. It covers much of programming in the Unix environment, all while implementing a programming language.
Chapter 8 is called Program Development. It discusses the development of a non-trivial program through various stages of design. That non-trivial program is hoc, the High-Order Calculator. For more details on hoc, see http://en.wikipedia.org/wiki/Hoc_(programming_language)
It is a great, practical introduction to implementing a simple language using the standard tools yacc and lex. yacc and lex are too much to cover here, but by following the examples in this book and doing the exercises you will develop an understanding of them.
The development lasts through various phases; in the first phase you don't even have variables in the language. By the third stage you have variables, defined constants (PI, E, etc.), and built-in functions like sin() and log(). By the last stage you have the fully implemented language.
Now, is hoc the best language to try and implement? I have no clue, but I know that The Unix Programming Environment was an excellent book to read in parallel with a traditional compiler book. As I began reading the Aho compiler book (dragon book) I re-read chapter 8 of TUPE and followed the examples and exercises. Sure, anyone can re-type the code from the book, but the exercises require that you have a good understanding of what is going on.
In the end I don't think it matters exactly which language you choose to do, but the process you follow while implementing it.
Upvotes: 3
Reputation: 9715
Oberon specification is small enough for your purposes: http://www-vs.informatik.uni-ulm.de:81/projekte/Oberon-2.Report/
R5RS or a pure functional subset of it is not that big too (if you ignore the numeric tower).
Upvotes: 1