JRR
JRR

Reputation: 6152

Implementing language translators in racket

I am implementing an interpreter that codegen to another language using Racket. As a novice I'm trying to avoid macros to the extent that I can ;) Hence I came up with the following "interpreter":

(define op (open-output-bytes))
(define (interpret arg)
  (define r
     (syntax-case arg (if)    
       [(if a b) #'(fprintf op "if (~a) {~a}" a b)]))
       ; other cases here
  (eval r))

This looks a bit clumsy to me. Is there a "best practice" for doing this? Am I doing a totally crazy thing here?

Upvotes: 1

Views: 387

Answers (2)

soegaard
soegaard

Reputation: 31147

If your source language shares syntax with Racket, then use read-syntax to produce a syntax-object representing the input program. Then use recursive descent using syntax-case or syntax-parse to discern between the various constructs.

Instead of printing directly to an output port, I recommend building a tree of elements (strings, numbers, symbols etc). The last step is then to print all the elements of the tree. Representing the output using a tree is very flexible and allows you to handle sub expressions out of order. It also allows you to efficiently concatenate output from different sources.

Macros are not needed.

Upvotes: 2

John Clements
John Clements

Reputation: 17203

Short answer: yes, this is a reasonable thing to do. The way in which you do it is going to depend a lot on the specifics of your situation, though.

You're absolutely right to observe that generating programs as strings is an error-prone and fragile way to do it. Avoiding this, though, requires being able to express the target language at a higher level, in order to circumvent that language's parser.

Again, it really has a lot to do with the language that you're targeting, and how good a job you want to do. I've hacked together things like this for generating Python myself, in a situation where I knew I didn't have time to do things right.

EDIT: oh, you're doing Python too? Bleah! :)

You have a number of different choices. Your cleanest choice is to generate a representation of Python AST nodes, so you can either inject them directly or use existing serialization. You're going to ask me whether there are libraries for this, and ... I fergits. I do believe that the current Python architecture includes ... okay, yes, I went and looked, and you're in good shape. Python's "Parser" module generates ASTs, and it looks like the AST module can be constructed directly.

https://docs.python.org/3/library/ast.html#module-ast

I'm guessing your cleanest path would be to generate JSON that represents these AST modules, then write a Python stub that translates these to Python ASTs.

All of this assumes that you want to take the high road; there's a broad spectrum of in-between approaches involving simple generalizations of python syntax (e.g.: oh, it looks like this kind of statement has a colon followed by an indented block of code, etc.).

Upvotes: 2

Related Questions