Reputation: 17
I am trying to parse SQL statements with regex and save it's parameters to use later.
Lets say I have this SQL statement:
INSERT INTO tablename (id, name, email) VALUES (@id, @name, @email)
The following regex will work just fine:
(@[0-9a-zA-Z$_]+)
However in this statement I should ignore everything in ' ' or " " and save only first parameter:
UPDATE mytable SET id = @id, name = 'myname@id' WHERE id = 1;
According to this answer https://stackoverflow.com/a/307957 "it's not practical to do it in a single regular expression", but I am still trying to do this.
I tried to add Regex Lookahead and Lookbehind, but its not working:
(?<!\').*(@[0-9a-zA-Z$_]+).*(?!\')
Is there any way to do it using only one regular expression? Should I use lookahead/lookbehind or something else?
Upvotes: 0
Views: 917
Reputation: 44486
You can simplify your regex. Note the group you want always to capture is followed with ,
or )
. Being aware of this fact you get this regex:
(@[0-9a-zA-Z$_]+)(?=[,)])
@[0-9a-zA-Z$_]+
is your value(?=[,)])
checks if the )
or ,
character follows.If the way describing where your string can't be placed is complicated, better look where it must be places instead.
See how it works at Regex101.
Upvotes: 0
Reputation: 266
You can use: [\=\(\s]\s*\@[0-9+^a-zA-Z_0-9$_]+\s*[\),]
Explanation:
[\=\(\s]
match a single character present in the list below
\=
matches the character = literally\(
matches the character ( literally\s
match any white space character [\r\n\t\f ]
\s*
match any white space character [\r\n\t\f ]
Quantifier: *
Between zero and unlimited times, as many times as possible, giving back as needed [greedy]
\@
matches the character @
literally
[0-9+^a-zA-Z_0-9$_]+
match a single character present in the list below
Quantifier: +
Between one and unlimited times, as many times as possible, giving back as needed [greedy]
0-9
a single character in the range between 0 and 9+^
a single character in the list +^
literallya-z
a single character in the range between a and z (case insensitive)A-Z
a single character in the range between A and Z (case insensitive)_
the literal character _
0-9
a single character in the range between 0 and 9$_
a single character in the list $_
literally\s*
match any white space character [\r\n\t\f ]
*
Between zero and unlimited times, as many times as possible, giving back as needed [greedy][\),]
match a single character present in the list below
\)
matches the character )
literally,
the literal character ,
Upvotes: 1