Reputation: 21
I have a program that goes through SQL statements, identifying component parts with a space. eg: -
SELECT * FROM tblSales WHERE CustomerID=10 AND Year=2011
Would produce the following as separate components: -
"SELECT","*","FROM","tblSales","WHERE","CustomerID=10","AND" and "Year=2011"
The problem I have however is the use of spaces within values eg.:-
SELECT * FROM tblSales WHERE CustomerNameID='Test Company' AND Year=2011
Using the same space-separating logic, this would produce components of: -
"SELECT,"*","FROM","tblSales","WHERE","CustomerID='Test","Company'","AND" and "Year=2011".
ie the space between "Test" & "Company" causes it to be treated as two separate components.
Question is, Is it possible via RegEx.Replace to replace the spaces between the quotation marks with another character whilst leaving all other spaces in the string in tact?
Shaun.
Upvotes: 1
Views: 478
Reputation: 7507
Possible? Maybe. Maybe not. Certainly not trivial. Parsing SQL with regexes is almost as bad, or quite possibly worse, than parsing HTML with regexes...
You might want to poke around SO and look for other people who have tried to parse SQL...
Upvotes: 1
Reputation: 336478
Assuming you don't have any escaped quotes, you could simply check if the space you're looking at is followed by an even number of quotes (which means it's outside of a string).
So
splitArray = Regex.Split(subjectString, " (?=(?:[^']*'[^']*')*[^']*$)");
gives you the desired result.
As a verbose regex:
[ ] # Match a space
(?= # only if it's followed by...
(?: # This subregex:
[^']*' # any number of non-quote characters, followed by a quote
[^']*' # same again (to ensure even number of quotes)
)* # zero or more times.
[^']* # and any number of non-quote characters until...
$ # the end of the string.
) # End of lookahead.
Upvotes: 1