Reputation: 29919
I have the following regex:
Regex defineProcedureRegex = new Regex(@"\s*(\bcreate\b\s+\bprocedure\b\s+(?:|dbo\.))(\w+)\s+(?:|(.+?))(as\s+(?:.+?)\s+\bgo\b)\s*", RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.RightToLeft | RegexOptions.Singleline);
that I am running against a SQL script file containing multiple "create procedure" statements. The format of the file is like this:
use databasename
create procedure dbo.procedure_name
@param1 varchar(10) -- optional
as
-- do stuff
go
use databasename
create procedure dbo.another_procedure
@param1 varchar(10) -- optional
as
-- do other stuff
go
The problem I have is that I need to match the first as
, but only the last go
. Since the procedure body may also contain as
and go
(within comments), I can't figure out how to do this reliably.
Any help/comments/advice?
Upvotes: 0
Views: 668
Reputation: 536605
Since the procedure body may also contain as and go (within comments)
(and within string literals and identifiers, yes).
You would have to do a greedy match up to go
. But, that would match from the start of the first stored procedure in the source up to the end of the last one!
You could use a negative match to ensure that the greedy match didn't go over a ‘create procedure’ boundary:
(as\s+(?:(?!create\s+procedure).)+?\s+\bgo\b)
however this is still not watertight, because you might also have create procedure
in a comment or string literal.
Conclusion: regex is an inadequate tool for parsing a complex, non-regular language such as SQL. You will need a proper SQL parser. This is not a simple job. See eg. this question.
Upvotes: 2
Reputation: 57976
Try this:
create procedure (?<schema>.*?)\.(?<name>\w+)(?<params>[\s\S]*?)?as[\s\S]*?go
Note I'm using a non-greedy pattern, i.e. *?
Upvotes: 0