Reputation: 5578
I would like to use a regular expression in OpenEdge to replace parts of a string.
In this case we want to replace SQL expressions with a symbol (now we use ?).
So what I have so far is this:
USING Progress.Lang.*.
USING System.Text.RegularExpressions.*.
BLOCK-LEVEL ON ERROR UNDO, THROW.
CLASS TOOLS.externals.TextSanitizer:
CONSTRUCTOR PUBLIC TextSanitizer ( ):
SUPER ().
END CONSTRUCTOR.
METHOD PUBLIC CHARACTER SanitizeSql( INPUT pInputText AS CHARACTER ):
DEFINE VARIABLE RESULT AS CHARACTER NO-UNDO.
DEFINE VARIABLE regexp AS Regex NO-UNDO.
regexp = NEW Regex("^((INSERT)\s+(INTO).*((VALUES)|(SELECT))|(SELECT).*(FROM)|(UPDATE).*(SET)|(DELETE)\s+(FROM)|((CREATE)|(DROP)|(TRUNCATE)|(ALTER)|(MODIFY))\s+((INDEX)|(TABLE)|(DATABASE)|(COLUMN)|(VIEW)))$").
RESULT = regexp:Replace(pInputText, "?").
DELETE OBJECT regexp.
RETURN RESULT.
END METHOD.
END CLASS.
What I have tried to get mixed casing work:
regexp = NEW Regex("/^rest of the regex$/ig").
And I wrote a procedure to run some tests:
BLOCK-LEVEL ON ERROR UNDO, THROW.
USING TOOLS.externals.TextSanitizer.
DEFINE VARIABLE textSanitizer AS TextSanitizer NO-UNDO.
textSanitizer = NEW TextSanitizer().
MESSAGE textSanitizer:SanitizeSql("SELECT * FROM") VIEW-AS ALERT-BOX.
RETURN.
The tests I have run so far:
SELECT * FROM
--> This returns ?
as expectedselect * from
--> This returns the original string, so mixed casing does not work, not even when I added /i
SELECT * FROM SELCT * FROM
--> This also returned ?
, while I would expect ? SELCT * FROM
SELECT * FROM SELECT * FROM
--> Returns ?
, what I would expect is ? ?
Could someone help me out here?
Upvotes: 1
Views: 1327
Reputation: 5578
The point here is to filter out some SQL statements because some people here think that is the best way to prevent SQL injection. (I know)
This time I solved it this way:
METHOD PUBLIC CHARACTER SanitizeSql( INPUT pInputText AS CHARACTER ):
DEFINE VARIABLE regexp AS Regex NO-UNDO.
regexp = NEW Regex("(?i)(INSERT\s+INTO.*?VALUES|SELECT.*?FROM|UPDATE.*?SET|DELETE\s+FROM|(CREATE|DROP|TRUNCATE|ALTER|MODIFY)\s+(INDEX|TABLE|DATABASE|COLUMN|VIEW))").
DO WHILE regexp:IsMatch(pInputText) = TRUE:
pInputText = regexp:Replace(pInputText, "?").
END.
DELETE OBJECT regexp.
RETURN pInputText.
END METHOD.
First as @Bohemian suggested I added (?i)
in front of the regex. That fixed the casing issue.
After that I removed the ^
and $
at the beginning and end of the regex string. That solved the issue with finding the beginning and end of a string.
Lastly I used IsMatch()
in a DO WHILE
loop to find all the instances and replace them.
Upvotes: 0
Reputation: 425208
First off, remove all brackets around words - they're useless noise (plus I feel like I'm back in a LISP tutorial).
If you're having trouble with case, try using the in-line flag (?i)
.
Your quantifiers are all greedy instead of reluctant: use .*?
instead of .*
.
Unless all you SWL fragments occupy the whole line, you should remove the end anchor $
.
Applying the above proposed changes, try this:
regexp = NEW Regex("(?i)^(INSERT\s+INTO.*?VALUES|SELECT.*?FROM|UPDATE.*?SET|DELETE\s+FROM|(CREATE|DROP|TRUNCATE|ALTER|MODIFY)\s+(INDEX|TABLE|DATABASE|COLUMN|VIEW))").
I have fixed the issues as asked, but won't help you with expanding the regex so it matches every possible SQL statement. That is too big a task.
Upvotes: 1