Translating Python for a Virtual Machine

Question

I am currently working on a project involving a new faster execution environment/VM for Python on Linux. The python source is parsed into an intermediate AST, analyzed and the code for the target VM is generated JIT and cached. Due to the JIT nature of the proposed machine, speed is essential and I am writing it as native as possible. Currently its completely implemented in C apart a single python interface to the compiler module. Currently I am able to build the AST using the Python compiler module and have it in memory.

For example the code:

class Test:
    def testFunc(arg1):
        print 'Arg is ' + arg1

generates the AST

Module(None, Stmt([Class('Test', [], None, Stmt([Function(None, 'testFunc', ['arg1'], [], 0, None, Stmt([Printnl([Add((Const('Arg is '), Name('arg1')))], None)]))]), None)]))

What I want to know is a efficient method to parse this AST into a manipulable data structure like a tree, which can be traversed and the target code emitted. I am confused as to whether to go for a parser generator like Bison or Lemon or manually tokenize and parse it. Since the AST is obtained after extensive error checks, so no point of further error checks, hence I believe a parser generator is overkill. Python itself provides AST walkers but it slows it down. But then I am really not too sure how to go about manually deciphering it. I would really appreciate any algorithm or suggestion or if possible a native language implementation.

Gareth Rees · Accepted Answer

Python already has a fast parser (see Parser/parser.c in the Python sources). You create a parser by calling PyParser_New and send it tokens by calling PyParser_AddToken. It builds a tree of node objects (see Parser/node.h):

typedef struct _node {
    short        n_type;
    char         *n_str;
    int          n_lineno;
    int          n_col_offset;
    int          n_nchildren;
    struct _node *n_child;
} node;

So if the ast module is too slow, use the C interface and process the parse tree directly.

Translating Python for a Virtual Machine

Answers (1)

Related Questions