Reputation:
I'm looking for good SQL parser. One that will work with subselects, non-select queries, CTE, window functions and other legal SQL elements.
Result would be some kind of abstract syntax tree, that I could later on work on.
Language is mostly irrelevant, as I am willing to learn new language just to use the library, if it exists.
I know that it is technically possible to extract parser from some open source database, but it's far from easy (at least for the parser of PostgreSQL which is what I need).
Upvotes: 4
Views: 1795
Reputation: 95334
Our DMS Software Reengineering Toolkit has PL/SQL and ANSI SQL 2011 full parsers (to ASTs) and prettyprinters (ASTs back to valid text). Neither of these are PostGres SQL, but DMS has a dialect mechanism that enables one to relatively easily build a dialect from a base grammar, by revising just some of the grammar rules and retaining the rest. Doing this from the SQL 2011 grammar seems like a practical way to tackle the problem.
DMS also offers facilities to access/traverse/modify the ASTs, both procedurally and in terms of surface-syntax patterns and transformations. Think of this as "life beyond parsing".
Upvotes: 0
Reputation: 483
You can google "sql parser". This is the one that listed: General SQL Parser Here are some highlighted features listed on official website:
It's a commercial SQL library.
Upvotes: 1
Reputation: 691
There's a non-validating SQL parser in Python: python-sqlparse. The tokens are exposed as objects. I doubt if they support "other legal SQL statements", window functions, and the like though as those are controlled by vendor specific grammars and no vendor is technically fully compliant with SQL standards.
Um (knowing that you're willing to learn a new language), why would you need to work on the syntax tree? If you need some magic in dealing with the database, probably you don't need to reinvent the wheel: Python got a fantastic database toolkit - SQL ALchemy.
Upvotes: 2