Reputation: 184091
I'm looking to write a Python import filter or preprocessor for source files that are essentially Python with extra language elements. The goal is to read the source file, parse it to an abstract syntax tree, apply some transforms in order to implement the new parts of the language, and write valid Python source which can then be consumed by CPython. I want to write this thing in Python and am looking for the best parser for the task.
The parser built into Python is not appropriate because it requires that the source files be actual Python, which these will not be. There are tons of parsers (or parser generators) that will work with Python, but it's hard to tell which is best for my needs without a whole bunch of research.
In summary, my requirements are:
Any suggestions?
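To make the intended pipeline concrete, here is a minimal sketch of the read, transform, and write-valid-Python steps using only the standard library. It works at the token level with `tokenize` rather than on a real AST, and the `unless` statement is a hypothetical extension invented purely for illustration, as is the function name `translate_unless`.

```python
import io
import tokenize

# Token types after which the next token starts a new statement.
LINE_START = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT, tokenize.DEDENT}


def translate_unless(source):
    """Rewrite the hypothetical `unless cond:` header as `if not (cond):`."""
    out = []
    prev_type = tokenize.NEWLINE  # treat start of file as start of a line
    in_header = False             # True while inside an `unless` header
    depth = 0                     # bracket nesting inside that header
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        ttype, text = tok.type, tok.string
        if ttype == tokenize.NAME and text == "unless" and prev_type in LINE_START:
            out += [(tokenize.NAME, "if"), (tokenize.NAME, "not"), (tokenize.OP, "(")]
            in_header, depth = True, 0
        elif in_header and ttype == tokenize.OP and text == ":" and depth == 0:
            # First top-level colon closes the condition.
            out += [(tokenize.OP, ")"), (tokenize.OP, ":")]
            in_header = False
        else:
            if in_header and ttype == tokenize.OP:
                if text in ("(", "[", "{"):
                    depth += 1
                elif text in (")", "]", "}"):
                    depth -= 1
            out.append((ttype, text))
        prev_type = ttype
    # Two-element tuples make untokenize() regenerate spacing itself;
    # the result is valid Python but not prettily formatted.
    return tokenize.untokenize(out)


if __name__ == "__main__":
    print(translate_unless("x = 1\nunless x > 3:\n    y = 'small'\n"))
```

This only works while the extended syntax still lexes as Python tokens, and it would also rewrite a variable that happens to be named `unless` at the start of a line. Anything structurally richer needs a real parser.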
Upvotes: 14
Views: 3601
Reputation: 1430
I would recommend that you check out my library: https://github.com/erezsh/lark
It can parse ALL context-free grammars, automatically builds an AST (with line & column numbers), and accepts the grammar in EBNF format, which is considered the standard.
It can easily parse a language like Python, and it can do so faster than any other parsing library written in Python.
In fact, there's already an example Python grammar and parser.
Upvotes: 7
Reputation: 77251
I like SimpleParse a lot, but I never tried to feed it the Python grammar (BTW, is it a deterministic grammar?). If it chokes, PLY will do the job.
See this compilation about Python parsing tools.
Upvotes: 2
Reputation: 601441
The first thing that comes to mind is lib2to3. It is a complete pure-Python implementation of a Python parser. It reads a Python grammar file and parses Python source files according to this grammar. It offers a great infrastructure for performing AST manipulations and writing back nicely formatted Python code -- after all, its purpose is to transform between two Python-like languages with slightly different grammars.
Unfortunately, it lacks documentation and doesn't guarantee a stable interface. There are projects that build on top of lib2to3 nevertheless, and the source code is quite readable. If API stability is an issue, you can just fork it.
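A minimal sketch of the parse, modify, and write-back cycle with lib2to3 might look like the following. The helper name `rename_identifier` is made up for this example, and the stock Python 3 grammar is used, so a custom language would need its own grammar file. Note that lib2to3 was deprecated in Python 3.11 and removed in 3.13, so the import is guarded.

```python
import warnings

try:
    with warnings.catch_warnings():
        # Python 3.11 and 3.12 emit a DeprecationWarning on import.
        warnings.simplefilter("ignore", DeprecationWarning)
        from lib2to3 import pygram, pytree
        from lib2to3.pgen2 import driver, token
except ImportError:
    driver = None  # lib2to3 is not shipped with this interpreter


def rename_identifier(source, old, new):
    """Parse `source`, rename every NAME leaf equal to `old`, return new source."""
    if driver is None:
        raise RuntimeError("lib2to3 is not available on this Python version")
    d = driver.Driver(pygram.python_grammar_no_print_statement,
                      convert=pytree.convert)
    tree = d.parse_string(source)  # source must end with a newline
    for leaf in tree.leaves():
        if leaf.type == token.NAME and leaf.value == old:
            leaf.value = new
            leaf.changed()
    # str() on the tree reproduces the source, including comments and spacing.
    return str(tree)


if __name__ == "__main__" and driver is not None:
    print(rename_identifier("x = 1  # keep\nprint(x)\n", "x", "y"))
```

Because whitespace and comments are stored on the tree's leaves, the untouched parts of the file come back byte for byte, which is what makes this approach attractive for source-to-source translation.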
Upvotes: 9