TotallyStuck
TotallyStuck

Reputation: 17

Regex Lookahead and Lookbehind to parse SQL statement

I am trying to parse SQL statements with regex and save it's parameters to use later.

Lets say I have this SQL statement:

INSERT INTO tablename (id, name, email) VALUES (@id, @name, @email)

The following regex will work just fine:

(@[0-9a-zA-Z$_]+)

However in this statement I should ignore everything in ' ' or " " and save only first parameter:

 UPDATE mytable SET id = @id, name = 'myname@id' WHERE id = 1;

According to this answer https://stackoverflow.com/a/307957 "it's not practical to do it in a single regular expression", but I am still trying to do this.

I tried to add Regex Lookahead and Lookbehind, but its not working:

(?<!\').*(@[0-9a-zA-Z$_]+).*(?!\')

Is there any way to do it using only one regular expression? Should I use lookahead/lookbehind or something else?

Upvotes: 0

Views: 917

Answers (2)

Nikolas
Nikolas

Reputation: 44486

You can simplify your regex. Note the group you want always to capture is followed with , or ). Being aware of this fact you get this regex:

(@[0-9a-zA-Z$_]+)(?=[,)])
  • @[0-9a-zA-Z$_]+ is your value
  • (?=[,)]) checks if the ) or , character follows.

If the way describing where your string can't be placed is complicated, better look where it must be places instead.

See how it works at Regex101.

Upvotes: 0

Taisbevalle
Taisbevalle

Reputation: 266

You can use: [\=\(\s]\s*\@[0-9+^a-zA-Z_0-9$_]+\s*[\),]

Explanation:

[\=\(\s] match a single character present in the list below

  • \= matches the character = literally
  • \( matches the character ( literally
  • \s match any white space character [\r\n\t\f ]

\s* match any white space character [\r\n\t\f ]

  • Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]

  • \@ matches the character @ literally [0-9+^a-zA-Z_0-9$_]+ match a single character present in the list below

  • Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]

  • 0-9 a single character in the range between 0 and 9
  • +^ a single character in the list +^ literally
  • a-z a single character in the range between a and z (case insensitive)
  • A-Z a single character in the range between A and Z (case insensitive)
  • _ the literal character _
  • 0-9 a single character in the range between 0 and 9
  • $_ a single character in the list $_ literally

\s* match any white space character [\r\n\t\f ]

  • Quantifier: * Between zero and unlimited times, as many times as possible, giving back as needed [greedy]

[\),] match a single character present in the list below

  • \) matches the character ) literally
  • , the literal character ,

Upvotes: 1

Related Questions