Michael0x2a
Michael0x2a

Reputation: 64188

Implementing goto in an ast

Background: As a short project over winter break, I'm trying to implement a programming language called Axe (designed for graphing calculators) using Python and PLY. A brief note: the language allows only global variables and makes heavy use of pointers.

I'm trying to implement goto in this language, but have no idea how to do it.

My general method is to first use PLY to parse the code into an ast, then walk through it executing as I go.

For example, the statement

If 3
    Disp 4
    Disp 6
End

...would turn into...

['PROGRAM', 
  ['BLOCK', 
    ['IF', 
      ['CONDITION', 3], 
      ['BLOCK', 
        ['DISP', 4], 
        ['DISP', 6]
      ]
    ]
  ]
]

...which I would execute recursively (I added indents for readability).

Because the ast is a tree, I'm not sure how to jump between different nodes. I've considered perhaps converting the tree to a flat-ish array ['IF', ['CONDITION', 3], ['DISP', 4], ['DISP', 6]] so that I can use the indices of the flat-ish array to go to specific lines in the code, but this seems to lack a certain elegance and almost feels like a step backwards (although I could be wrong).

I've looked at this, but was unable to understand how it worked.

Any help or hints would be appreciated.

Upvotes: 9

Views: 1739

Answers (4)

Tobias Gierke
Tobias Gierke

Reputation: 470

I was just facing the same problem and came up with the following solution (assuming the language you're using to implement your interpreter supports exceptions).

  1. Make sure your AST nodes can be uniquely identified (unique ID, pointer/reference value, etc.)
  2. When constructing your AST, make sure you have some way to represent a (block) scope, either explicitly with a "Block" node or implicitly via some "isBlock(node") function
  3. When encountering a 'goto', resolve the destination AST node. Then pass the identifier (or even the AST node instance itself) along with your exception (in Java this can be done with a member field in a custom Exception subclass, in other languages you may need to encode it in the exception message string etc.).
  4. When interpreting a block scope AST node, loop over the children of this block and have this loop wrapped in a try/catch(GotoException). When you catch a GotoException, check whether the target AST node is a direct child of the block scope that caught it. If the target is a direct child of the current block scope, restart execution at that node. If not, re-throw the exception so the next block scope higher up the call stack can have a look at it.

Definitely not very performant but much easier to implement than "flattening" your AST (if you start going down that path you might as well write a compiler instead of an interpreter).

Upvotes: 0

0 _
0 _

Reputation: 11504

You can also flatten the AST to a directed graph (the control-flow graph). An example of how this can be done to produce a networkx graph that can be traversed by the interpreter can be found in the Python package promela. Note that you'll have to write some AST classes for that purpose.

Upvotes: 0

S.Lott
S.Lott

Reputation: 391952

I've considered perhaps converting the tree to a flat-ish array ... but this seems to lack a certain elegance and almost feels like a step backwards (although I could be wrong).

You are wrong. Machine code is always flat. Languages like C are flattened to create machine code.

A calculator (like other simple machines) is flat.

However. Flattening out your AXE syntax tree is not completely necessary.

You simply need to apply the programming source labels to each node in your tree.

Then a "GOTO" simply searches the tree for that label and resumes execution at that label.

Upvotes: 3

Aaron Digulla
Aaron Digulla

Reputation: 328724

"execute recursively" doesn't fit well with goto. For goto to work, you need a PC, a "program counter" and each statement in your program must have a distinct address. As it is executed, the address of each statement is assigned to the PC. When goto is encountered, the target address of the goto (it's argument) is put into the PC and the execution resumes from there.

This is almost impossible to achieve with a stack-based, recursive approach. You have two options:

  • Flatten your AST into a sequence where you can assign a distinct address to each statement

  • Add a "skip" mode to your interpreter. When goto is encountered, throw a GotoException which breaks out of all of the stack frames and goes back to the root. Process statements (skip them without executing) until you reach the target address.

I think you can imagine that this implementation of goto isn't very performant and probably brittle to implement.

Upvotes: 8

Related Questions