What defines how AST should be transformed into the target language

The structure of an AST is defined by the syntactical grammar. I'm wondering what defines how AST should be transformed into a target language. Is it the syntactical grammar of the target language?

Upvotes: 2

Answers (2)

Ira Baxter

Reputation: 95334

The real answer is that the language semantics drive the translation.

Your source and target languages each have syntax, and each bit of syntax has a semantic interpretation.

What the translator has to do is find configuration ("patch") of syntax in the source language, and a corresponding configuration of syntax int the target language, such that the semantics of the source language patch is preserved. If you can break the original source program into a large number of such patches, such that each patch translates preserving semantics, and the patches have consistent boundaries, then you have a translation. That's pretty abstract and the actual math to say it is worse but the intuition is really right.

Some tools (I build one, see my bio) actually let you write translation rules that express exactly these "patches". We can them "rewrite" rules; what they let you express is a fragment of syntax in the source language, and fragment of syntax in the target language, and assert that the latter is a semantically valid translation of the former. So, you can literally build a translator by writing a large set of such rules. [You get a large set because real languages have lots of syntax combinations and semantic interpretations].

The rules are a bit more complicated. There's an additional arbitrary predicate attached that verifies that indeed the semantics gets preserved. What this means in practice is that pure syntax trees are not enough. Often that predicate needs to test some concepts which are not trivial to extract from the syntax trees. Rather than building hugely complex predicates to do nontrivial fact extraction, what most practical compilers do is precompute information that makes it easy to implement such predicates. You can see this in real compilers in the form of symbol tables, control and data flow graphs, and other classic compiler structures.

If you ask, I can provide some links to examples but they are specific to our tools.

Upvotes: 3

sepp2k

Reputation: 370142

The structure of an AST is defined by the syntactical grammar.

Sort of. The parse tree (or concrete syntax tree) corresponds directly to the structure of the grammar. The abstract syntax tree is an abstraction of the parse tree, which leaves out irrelevant parts. What is and isn't irrelevant also depends on the semantics of the language.

For example the expressions (x) and x produce the same AST in most implementations of most languages, but you can only do that if those two are equivalent according to the semantics of the language. In Lisp, for example, (x) would be a function call of the function x, whereas x would be an access of the variable x. So that's an example where different semantics lead to a different AST.

I'm wondering what defines how AST should be transformed into a target language. Is it the syntactical grammar of the target language?

The syntactical grammar of the target language is only relevant in so far that one basic necessary requirement for compiler correctness is that all the code you generate must fit that grammar (otherwise you're generating invalid code). However that requirement is hardly sufficient.

Another very important requirement for correctness is that, given an input program with well-defined semantics, the generated program's semantics must be also well-defined and equivalent.

So the logic to generate a valid target program from an AST depends on the source language's semantics as well as the target language's syntax and semantics.

Upvotes: 3

What defines how AST should be transformed into the target language

Answers (2)

Related Questions