Reputation: 369
Please consider this string:
string mySelects = @"SELECT * FROM TABLE1 WHERE 1=1
SELECT T.*
FROM (SELECT * FROM TABLE2 WHERE 2=2) T
ORDER BY FOO
SELECT FIELD1, FIELD3
FROM TABLE3
GROUP BY FIELD";
Is there any regular expression to return a MatchCollection with these three Selects in each result?
Thanks in advance!
Upvotes: 0
Views: 689
Reputation: 25927
There is none.
Mathematically speaking, a regular expression describes some set of strings (these, which match this regular expression). Similarly a grammar describes a set of strings (which match this grammar). But grammars are a superset over regular expressions, because every regular expression can be described by a grammar, but unfortunately not the other way around.
SQL is a language, which is described by some grammar. But this grammar is way too complicated to be described by a regular expression. You need some more mathematical firepower to process this language.
The solution is to seek for a ready-made SQL parser or write one using tools like ANTLR.
Upvotes: 1
Reputation: 726579
There is no regex like that, and it is not possible to write one: the regex programming model is not powerful enough to deal with counting. It is not powerful enough even to balance parentheses in an expression - something that would be required to capture the third select
in your example. The fact that .NET's regex engine supports balancing groups is not going to help you much: regex is hopelessly inadequate for the task at hand.
You need something more powerful than the regex - preferably, a complete SQL parser. I have tried this parser (registration is required), it worked fine on many queries. It also has a source in case you wish to adjust it to your needs.
Upvotes: 2