Reputation: 18861
Our company uses its own in-house scripting language for programming, and they would like to create an interpreter that translates these scripts to Java. The scripting language is fairly substantial, so this is no small thing.
I've been asked to take on this task, and it doesn't look like a trivial challenge. Before I do anything stupid and start writing billions of lines of parsing code, what should I know? Where should I start to do this properly?
PS: I want to translate script files to .java sources, not directly to bytecode.
Upvotes: 4
Views: 9228
Reputation: 2834
If you want to translate your script to Java, that is not an interpreter but a compiler. If you want to execute the script as it is read, that is an interpreter.
Either way, you should look at JavaCC or ANTLR. Both are suitable for compilers as well as interpreters. You specify the language's syntax rules and write some additional logic in Java that implements the semantics of your scripting language. If you want a compiler, the Java code you write will generate further Java (or any other) code. If you want an interpreter, the Java code you write will directly execute the script.
One more concept that is good to know about is the Abstract Syntax Tree (AST).
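To make the compiler/interpreter distinction concrete, here is a minimal sketch of an AST for a toy expression language. The class names (`AstSketch`, `Node`, `Num`, `Var`, `BinOp`) are made up for illustration and have nothing to do with your actual scripting language. Each node can either be evaluated directly (the interpreter path) or emit Java source text (the compiler path).

```java
import java.util.HashMap;
import java.util.Map;

public class AstSketch {
    // Hypothetical AST node for a toy expression language.
    public interface Node {
        int eval(Map<String, Integer> env); // interpreter: execute directly
        String toJava();                    // compiler: emit Java source text
    }

    public static final class Num implements Node {
        final int value;
        public Num(int value) { this.value = value; }
        public int eval(Map<String, Integer> env) { return value; }
        public String toJava() { return Integer.toString(value); }
    }

    public static final class Var implements Node {
        final String name;
        public Var(String name) { this.name = name; }
        public int eval(Map<String, Integer> env) { return env.get(name); }
        public String toJava() { return name; }
    }

    public static final class BinOp implements Node {
        final char op;
        final Node left;
        final Node right;
        public BinOp(char op, Node left, Node right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        public int eval(Map<String, Integer> env) {
            int l = left.eval(env);
            int r = right.eval(env);
            switch (op) {
                case '+': return l + r;
                case '-': return l - r;
                case '*': return l * r;
                default: throw new IllegalArgumentException("unknown operator: " + op);
            }
        }
        public String toJava() {
            // Fully parenthesized so operator precedence never matters.
            return "(" + left.toJava() + " " + op + " " + right.toJava() + ")";
        }
    }

    public static void main(String[] args) {
        // AST for the script expression: x + 2 * 3
        Node ast = new BinOp('+', new Var("x"), new BinOp('*', new Num(2), new Num(3)));

        Map<String, Integer> env = new HashMap<>();
        env.put("x", 4);
        System.out.println(ast.eval(env));                         // prints 10
        System.out.println("int result = " + ast.toJava() + ";"); // prints int result = (x + (2 * 3));
    }
}
```

A parser generated by JavaCC or ANTLR (or written by hand) would build a tree like this from the script text. The translator is then just another walk over the same tree.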
Here is a comprehensive list of further lexer and parser generators.
Upvotes: 5
Reputation: 220
I'd recommend getting a book on writing compilers/interpreters in Java. There are quite a few, e.g. Writing Compilers and Interpreters.
It's better to see the big picture first before starting off with a lexer/parser, etc.
Or, if you want to jump in directly, try ANTLR.
Upvotes: 0
Reputation: 3090
I recommend ANTLR, a Java library for language recognition. It is used by a number of JVM languages. I have not used it personally, but I know that Groovy was built using it.
Upvotes: 0
Reputation: 50147
It sounds like an interesting task :-) Could you describe the scripting language a bit?
I would look at the package javax.script; possibly there is a similar scripting language already (I know Scala can be used as a scripting language). Also, I would look at javax.tools.JavaCompiler. I'm building a Java source generator right now (to create and compile a class proxy at runtime). Generating Java source code is a lot easier than generating bytecode, that is for sure.
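As a rough illustration of the javax.tools.JavaCompiler route, the sketch below writes a generated source string to a temporary directory, compiles it, and loads the resulting class. The class name `CompileGenerated`, the helper `compile`, and the generated class `Generated` are placeholders of mine. It assumes you are running on a JDK, since `ToolProvider.getSystemJavaCompiler()` returns null when no compiler is available.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CompileGenerated {
    /** Writes source to dir/className.java and compiles it into dir; returns true on success. */
    public static boolean compile(Path dir, String className, String source) throws IOException {
        Path file = dir.resolve(className + ".java");
        Files.write(file, source.getBytes(StandardCharsets.UTF_8));

        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler found; run on a JDK");
        }
        // Null streams mean System.in/out/err; run() returns 0 on success.
        int status = compiler.run(null, null, null, "-d", dir.toString(), file.toString());
        return status == 0;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the output of a script-to-Java translator.
        String source =
                "public class Generated {\n" +
                "    public static int answer() { return 6 * 7; }\n" +
                "}\n";

        Path dir = Files.createTempDirectory("generated");
        if (!compile(dir, "Generated", source)) {
            System.out.println("compilation failed");
            return;
        }
        // Load the freshly compiled class from the output directory and call it.
        try (URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() })) {
            Class<?> cls = loader.loadClass("Generated");
            System.out.println(cls.getMethod("answer").invoke(null));
        }
    }
}
```

If you only need the .java files, as in the question, you can stop after writing the source. Compiling it right away is still a cheap way to check that the translator emits valid Java.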
As for parsing, I would first create a good BNF for your language. There is a tool to generate HTML railroad diagrams out of that. You will make mistakes when writing the BNF, but you will find them if you look at the railroad diagrams. And it will ensure you don't make something that can't be parsed.
I know most people will suggest using ANTLR or JavaCC, but I would write your own recursive-descent parser, because I think it's easier and more flexible (I have done both a few times and know what I'm talking about). One example is the Jackrabbit SQL-2 parser.
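To give an idea of what a hand-written recursive-descent parser looks like, here is a minimal sketch for a toy arithmetic grammar that emits a Java expression as it parses. The grammar and the `ExprTranslator` class are invented for illustration. A real scripting language would need a proper tokenizer, statements, and probably an AST between parsing and code generation.

```java
public class ExprTranslator {
    private final String src;
    private int pos;

    private ExprTranslator(String src) { this.src = src; }

    /** Translates a toy arithmetic expression into a fully parenthesized Java expression. */
    public static String translate(String src) {
        ExprTranslator p = new ExprTranslator(src);
        String java = p.expr();
        if (p.peek() != '\0') throw p.error("unexpected input");
        return java;
    }

    // expr := term (('+' | '-') term)*
    private String expr() {
        String left = term();
        while (true) {
            char op = peek();
            if (op != '+' && op != '-') return left;
            pos++;
            left = "(" + left + " " + op + " " + term() + ")";
        }
    }

    // term := factor (('*' | '/') factor)*
    private String term() {
        String left = factor();
        while (true) {
            char op = peek();
            if (op != '*' && op != '/') return left;
            pos++;
            left = "(" + left + " " + op + " " + factor() + ")";
        }
    }

    // factor := NUMBER | IDENT | '(' expr ')'
    private String factor() {
        if (peek() == '(') {
            pos++;
            String inner = expr();
            if (peek() != ')') throw error("expected ')'");
            pos++;
            return inner;
        }
        int start = pos;
        // Simplification: numbers and identifiers are not told apart here.
        while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) pos++;
        if (start == pos) throw error("expected number or identifier");
        return src.substring(start, pos);
    }

    // Skips whitespace and returns the next character, or '\0' at end of input.
    private char peek() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    private IllegalArgumentException error(String message) {
        return new IllegalArgumentException(message + " at position " + pos);
    }

    public static void main(String[] args) {
        System.out.println(translate("a + 2 * (b - 1)")); // prints (a + (2 * (b - 1)))
    }
}
```

Each grammar rule becomes one method, which is what makes this style easy to extend and debug: a new construct in the language is a new method plus a call from the rule that contains it.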
Upvotes: 3