Phrogz

Reputation: 303178

Eval a string without string interpolation

AKA How do I find an unescaped character sequence with regex?

Given an environment set up with:

@secret = "OH NO!"
$secret = "OH NO!"
@@secret = "OH NO!"

and given a string read in from a file that looks like this:

some_str = '"\"#{:NOT&&:very}\" bad. \u262E\n#@secret \\#$secret \\\\#@@secret"'

I want to evaluate this as a Ruby string, but without interpolation. Thus, the result should be:

puts safe_eval(some_str)
#=> "#{:NOT&&:very}" bad. ☮
#=> #@secret #$secret \#@@secret

By contrast, the eval-only solution produces

puts eval(some_str)
#=> "very" bad. ☮
#=> OH NO! #$secret \OH NO!

At first I tried:

def safe_eval(str)
  eval str.gsub(/#(?=[{@$])/,'\\#')
end

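but this fails in the malicious middle case above. There the # is already escaped, so adding another backslash turns \# into \\#, an escaped backslash followed by live interpolation, producing: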

#=> "#{:NOT&&:very}" bad. ☮
#=> #@secret \OH NO! \#@@secret

Upvotes: 2

Views: 493

Answers (2)

Linuxios

Reputation: 35783

How about not using eval at all? As per this comment in chat, the only things that need handling are escaped quotes, newlines, and Unicode escapes. Here's my solution:

ESCAPE_TABLE = {
  /\\n/ => "\n",
  /\\"/ => "\"",
}
def expand_escapes(str)
  str = str.dup
  ESCAPE_TABLE.each {|k, v| str.gsub!(k, v)}
  # Deal with Unicode escapes such as \u262E (four hex digits, either case)
  str.gsub!(/\\u([0-9A-Fa-f]{4})/) { [$1.hex].pack("U") }
  str
end

When called on your string the result is (in your variable environment):

"\"\"\#{:NOT&&:very}\" bad. ☮\n\#@secret \\\#$secret \\\\\#@@secret\""

Although I would have preferred not to treat Unicode specially, I don't see another way to handle it without eval.
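One caveat with applying the table entries one after another is that an escaped backslash followed by n (\\n) gets its second half rewritten as a newline. A possible way around that is to expand every escape in a single gsub pass. The following is only a sketch under assumptions: the names SIMPLE_ESCAPES and expand_escapes_single_pass are hypothetical, the table covers just a few escapes, and any other escaped character simply loses its backslash (which is what a double-quoted Ruby string does for characters with no special meaning).

```ruby
# Hypothetical helper, not part of the answer above: expands a small set of
# backslash escapes in one left-to-right pass, without calling eval.
SIMPLE_ESCAPES = {
  'n'  => "\n",
  't'  => "\t",
  '"'  => '"',
  '\\' => '\\'
}.freeze

def expand_escapes_single_pass(str)
  # Each match is one backslash plus either a \uXXXX escape or a single
  # character, so an escaped backslash is consumed before the next character
  # is examined.
  str.gsub(/\\(?:u(\h{4})|(.))/m) do
    if $1
      [$1.hex].pack('U')                # \uXXXX becomes the code point
    else
      SIMPLE_ESCAPES.fetch($2) { $2 }   # assumption: unknown escapes drop the backslash
    end
  end
end
```

Because nothing is evaluated, sequences such as #{...}, #@secret and #$secret pass through as plain text. The sketch does not strip the surrounding double quotes from the input; that would be a separate step.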

Upvotes: 1

Phrogz

Reputation: 303178

You can do this via regex by ensuring that there are an even number of backslashes before the character you want to escape:

def safe_eval(str)
  eval str.gsub( /([^\\](?:\\\\)*)#(?=[{@$])/, '\1\#' )
end

…which says:

  • Find a character that is not a backslash [^\\]
  • followed by two backslashes (?:\\\\)
    • repeated zero or more times *
  • followed by a literal # character
  • and ensure that after that you can see either a {, @, or $ character.
  • and replace that with
    • the non-backslash-maybe-followed-by-even-number-of-backslashes
    • and then a backslash and then a #
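The steps above can be tried out in isolation. Since the pattern starts with [^\\], it needs some character before the backslashes, so a # at the very start of the string is not matched. That rarely matters when the string begins with a quote, but the variant below swaps in a negative lookbehind to cover it. This is a sketch rather than the code from this answer: INTERPOLATION_START, escape_interpolation and safe_eval_sketch are hypothetical names.

```ruby
# Hypothetical variant of the pattern above: a negative lookbehind replaces
# the leading [^\\], so the run of backslash pairs may begin at position 0.
INTERPOLATION_START = /(?<!\\)((?:\\\\)*)#(?=[{@$])/

def escape_interpolation(str)
  # $1 holds the even-length run of backslashes (possibly empty); keep it
  # and put one extra backslash in front of the #.
  str.gsub(INTERPOLATION_START) { $1 + '\\#' }
end

def safe_eval_sketch(str)
  eval(escape_interpolation(str))
end

# An unescaped interpolation marker is neutralised before eval sees it.
puts safe_eval_sketch('"#{1 + 1} stays literal"')
```

A # preceded by an odd number of backslashes is already escaped and is left alone; one preceded by an even number (including zero) gets the extra backslash.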

Upvotes: 1
