Reputation: 811
I am trying to write a regex that can pull a string value out of a MySQL query string.
For example, given the following generated SQL string, I want to extract the first_name value:
my_string = "SELECT * FROM users WHERE first_name = 'first name value'"
What I currently have appears to work for most cases:
result = /first_name = ['"](.*?)['"]/i.match my_string
However, it breaks when the first_name value contains either a ' or a ", e.g.
my_string = "SELECT * FROM users WHERE first_name = 'first\"s name value'"
or
my_string = "SELECT * FROM users WHERE first_name = 'first\\'s name value'"
The returned group only contains the value up to the first quote character, so it is "first" in the first case and "first\" in the second. How can I fix it so that the entire first_name value gets returned?
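A minimal reproduction of the problem, as a sketch: the lazy group stops at the first quote of either kind, whether or not it is escaped or matches the opening quote. The variable names below are only illustrative.

```ruby
# The lazy (.*?) ends at the first ' or " it meets, escaped or not.
naive = /first_name = ['"](.*?)['"]/i

plain   = "SELECT * FROM users WHERE first_name = 'first name value'"
dquote  = "SELECT * FROM users WHERE first_name = 'first\"s name value'"
escaped = "SELECT * FROM users WHERE first_name = 'first\\'s name value'"

plain_value   = naive.match(plain)[1]    # the whole value
dquote_value  = naive.match(dquote)[1]   # cut off at the embedded "
escaped_value = naive.match(escaped)[1]  # cut off at the escaped ', keeps the backslash

puts plain_value, dquote_value, escaped_value
```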
Upvotes: 1
Views: 1576
Reputation: 627535
You seem to need to match strings inside single or double quotes and only match between the matching quotes.
Ruby regexes allow several named groups to share the same name, so you can use one alternative per quote type and read the value from a single group:
/first_name = (?:'(?<val>[^'\\]*(?:\\.[^'\\]*)*)'|"(?<val>[^"\\]*(?:\\.[^"\\]*)*)")/i
See the Rubular demo
The value between the quotes will be in the "val" group.
Here is an IDEONE Ruby demo:
my_string = "SELECT * FROM users WHERE first_name = 'first name value'"
my_string2 = "SELECT * FROM users WHERE first_name = 'first\"s name value'"
my_string3 = "SELECT * FROM users WHERE first_name = 'first\\'s name value'"
rx = /first_name = (?:'(?<val>[^'\\]*(?:\\.[^'\\]*)*)'|"(?<val>[^"\\]*(?:\\.[^"\\]*)*)")/i
puts rx.match my_string # => first_name = 'first name value'
puts rx.match my_string2 # => first_name = 'first"s name value'
puts rx.match my_string3 # => first_name = 'first\'s name value'
To get the "val" (demo):
rx.match(my_string)["val"] # => first name value
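As a quick check of the "val" group across all cases, here is a small sketch. It assumes the closing double quote sits outside the second "val" group (so it is not captured), and the fourth query, a double-quoted value, is a hypothetical addition that is not in the question.

```ruby
# Same-name groups: whichever alternative matches is the one "val" refers to.
rx = /first_name = (?:'(?<val>[^'\\]*(?:\\.[^'\\]*)*)'|"(?<val>[^"\\]*(?:\\.[^"\\]*)*)")/i

queries = [
  "SELECT * FROM users WHERE first_name = 'first name value'",
  "SELECT * FROM users WHERE first_name = 'first\"s name value'",
  "SELECT * FROM users WHERE first_name = 'first\\'s name value'",
  "SELECT * FROM users WHERE first_name = \"first's name value\"" # hypothetical double-quoted case
]

# Guard against nil in case a query does not match at all.
values = queries.map { |q| (m = rx.match(q)) && m["val"] }
values.each { |v| puts v }
```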
Since named groups were only introduced in Ruby 1.9 and you need this to work in Ruby 1.8, use a numbered capturing group with a backreference instead, restricting the character class with a negative lookahead:
/first_name = (['"])((?:(?!\1)[^\\])*(?:\\.(?:(?!\1)[^\\])*)*)\1/i
See the Rubular demo
(['"]) matches a ' or " and captures it into Group 1.
(?:(?!\1)[^\\])* matches 0+ characters that are neither a \ (due to [^\\]) nor the quote captured in Group 1 (due to (?!\1)).
(?:\\.(?:(?!\1)[^\\])*)* matches 0+ escape sequences (see \\., a backslash followed by any character except a line break), each followed by 0+ characters other than \ and the Group 1 quote.
The \1 backreference matches the corresponding closing quote.
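The same pattern can be written in extended mode (the x flag) with one comment per piece, which may make the breakdown easier to follow. This is only a sketch: both input queries are hypothetical, and the literal spaces are escaped because extended mode ignores unescaped whitespace.

```ruby
rx = /
  first_name\ =\       # literal prefix, spaces escaped for extended mode
  (['"])               # Group 1: the opening quote
  (                    # Group 2: the value
    (?:(?!\1)[^\\])*   #   chars that are neither the Group 1 quote nor a backslash
    (?:
      \\.              #   an escape sequence: backslash plus one char
      (?:(?!\1)[^\\])* #   more ordinary chars
    )*
  )
  \1                   # closing quote, must equal Group 1
/xi

# Hypothetical double-quoted value containing an apostrophe and escaped double quotes.
m = rx.match(%q(SELECT * FROM users WHERE first_name = "first's \"nick\" name"))
quote = m[1]
value = m[2]

# A value opened with ' and "closed" with " does not match.
unbalanced = rx.match(%q(SELECT * FROM users WHERE first_name = 'abc"))

puts quote, value, unbalanced.inspect
```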
See another Ruby demo:
my_string = "SELECT * FROM users WHERE first_name = 'first name value'"
my_string2 = "SELECT * FROM users WHERE first_name = 'first\"s name value'"
my_string3 = "SELECT * FROM users WHERE first_name = 'first\\'s name value'"
rx = /first_name = (['"])((?:(?!\1)[^\\])*(?:\\.(?:(?!\1)[^\\])*)*)\1/i
puts rx.match my_string # => first_name = 'first name value'
puts rx.match(my_string)[2] # => first name value
puts rx.match my_string2 # => first_name = 'first"s name value'
puts rx.match(my_string2)[2] # => first"s name value
puts rx.match my_string3 # => first_name = 'first\'s name value'
puts rx.match(my_string3)[2] # => first\'s name value
Upvotes: 2
Reputation: 468
If the value is always the last thing in the query, you could anchor the closing quote to the end of the string:
/first_name = ['"](.*?)['"]\z/i
example here
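A short sketch of how the \z anchor behaves. The second query, with a trailing condition, is a hypothetical example that is not in the question; it shows that the capture runs to the last quote in the string whenever anything follows the value.

```ruby
anchored = /first_name = ['"](.*?)['"]\z/i

# Value at the very end of the query: the whole value is captured.
at_end = "SELECT * FROM users WHERE first_name = 'first\\'s name value'"
at_end_value = anchored.match(at_end)[1]

# Hypothetical query with more conditions: the capture spills over into them.
trailing = "SELECT * FROM users WHERE first_name = 'first' AND last_name = 'last'"
trailing_value = anchored.match(trailing)[1]

puts at_end_value, trailing_value
```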
Upvotes: 1
Reputation: 2162
I tested this out on Rubular and it seems to get the value that you're looking for. The only catch is that the capture also includes your escape characters (the backslashes), which you can strip afterwards:
f_name = /first_name = '(.+)'/i.match(string)[1].gsub('\\', '')
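Removing every backslash would also delete backslashes that belong to the value. A slightly more careful sketch drops only the escaping backslash and keeps the character after it. It assumes a backslash simply means "take the next character literally" and does not handle MySQL's special sequences such as \n; unescape_value is a hypothetical helper name.

```ruby
# Hypothetical helper: drop each escaping backslash, keep the character it escapes.
def unescape_value(raw)
  raw.gsub(/\\(.)/m) { Regexp.last_match(1) }
end

string = "SELECT * FROM users WHERE first_name = 'first\\'s name value'"
captured  = /first_name = '(.+)'/i.match(string)[1] # still contains the backslash
unescaped = unescape_value(captured)

# An escaped backslash collapses to a single backslash instead of disappearing.
collapsed = unescape_value("a\\\\b")

puts captured, unescaped, collapsed
```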
Upvotes: 0