Dave

Reputation: 19220

Why is Regexp.escape not working in my Ruby expression?

I'm using Ruby 2.4. I have some strings that contain characters with special meaning in regular expressions. To eliminate any possibility of those characters being interpreted as regexp metacharacters, I use "Regexp.escape" to escape them. However, I still can't get the regular expression below to work ...

2.4.0 :005 >   tokens = ["a", "b?", "c"]
 => ["a", "b?", "c"] 
2.4.0 :006 > line = "1\ta\tb?\tc\t3"
 => "1\ta\tb?\tc\t3" 
2.4.0 :009 > /#{Regexp.escape(tokens.join(" ")).gsub(" ", "\\s+")}/.match(line)
 => nil 

How can I properly escape the characters before substituting the space with a "\s+" expression, which I do want interpreted as a regexp character?

Upvotes: 2

Views: 167

Answers (1)

Wiktor Stribiżew

Reputation: 627103

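When Regexp.escape(tokens.join(" ")).gsub(" ", "\\s+") is executed, tokens.join(" ") yields a b? c. That string is then escaped, spaces included, giving a\ b\?\ c. Finally, the gsub replaces each space with \s+ but leaves the backslash that Regexp.escape put in front of the space, so the result is a\\s+b\?\\s+c. In that pattern, each \\ matches a literal backslash and is followed by one or more s characters; it no longer forms the special regex sequence \s that matches whitespace. Since line is 1 a b? c 3 (tab-separated) and contains no backslashes, the match fails.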
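The intermediate values described above can be checked step by step. This is only a sketch that reuses tokens and line from the question; the variable names joined, escaped, source and result are made up here for illustration.

```ruby
tokens = ["a", "b?", "c"]
line   = "1\ta\tb?\tc\t3"

joined  = tokens.join(" ")            # a b? c
escaped = Regexp.escape(joined)       # a\ b\?\ c   (the spaces are escaped too)
source  = escaped.gsub(" ", "\\s+")   # a\\s+b\?\\s+c

puts joined
puts escaped
puts source

# The pattern now asks for a literal backslash followed by one or more "s"
# characters, so it cannot match the tab-separated line.
result = Regexp.new(source).match(line)
p result   # nil
```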

You need to escape each token first, and then either join the escaped tokens with \s+ directly, or join them with a space and replace the space with \s+ afterwards:

/#{tokens.map { |n| Regexp.escape(n) }.join("\\s+")}/.match(line)

OR

/#{tokens.map { |n| Regexp.escape(n) }.join(" ").gsub(" ", "\\s+")}/.match(line)
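For reuse, the first variant could be wrapped in a small method. The name tokens_to_regexp is hypothetical and not part of Ruby or of the original post; the inputs are the ones from the question.

```ruby
# Hypothetical helper: escapes each token on its own, then joins the escaped
# tokens with \s+ so that only the separator is treated as regex syntax.
def tokens_to_regexp(tokens)
  Regexp.new(tokens.map { |t| Regexp.escape(t) }.join("\\s+"))
end

line  = "1\ta\tb?\tc\t3"
regex = tokens_to_regexp(["a", "b?", "c"])

match = regex.match(line)
p match[0]                        # "a\tb?\tc"

# "b?" is matched literally, so a line without the "?" does not match.
p regex.match("1\ta\tb\tc\t3")    # nil
```

Joining with \s+ directly is probably the safer of the two variants: Regexp.escape also escapes spaces, so if a token itself contained a space, the join(" ").gsub(" ", "\\s+") variant would leave a stray backslash behind again.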

Upvotes: 2
