Reputation: 19220
I'm using Ruby 2.4. I have some strings that contain characters with special meaning in regular expressions. To eliminate any possibility of those characters being interpreted as regexp metacharacters, I use Regexp.escape to escape them. However, I still can't make the regular expression below work ...
2.4.0 :005 > tokens = ["a", "b?", "c"]
=> ["a", "b?", "c"]
2.4.0 :006 > line = "1\ta\tb?\tc\t3"
=> "1\ta\tb?\tc\t3"
2.4.0 :009 > /#{Regexp.escape(tokens.join(" ")).gsub(" ", "\\s+")}/.match(line)
=> nil
How can I properly escape the characters before substituting the space with a "\s+" expression, which I do want interpreted as a regexp metacharacter?
Upvotes: 2
Views: 167
Reputation: 627103
When Regexp.escape(tokens.join(" ")).gsub(" ", "\\s+") is executed, tokens.join(" ") yields a b? c, then the string is escaped to a\ b\?\ c (Regexp.escape escapes the spaces too), and then the gsub is executed, resulting in a\\s+b\?\\s+c. Now, line is "1\ta\tb?\tc\t3". So each \\ in the pattern matches a literal backslash, followed by one or more s characters; the sequences no longer form the special regex metacharacter \s that matches whitespace.
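To see where the pattern goes wrong, the intermediate strings can be printed one step at a time. This is a minimal sketch using the tokens and line from the question; the variable names joined, escaped, and pattern are only illustrative.

```ruby
tokens = ["a", "b?", "c"]
line   = "1\ta\tb?\tc\t3"

joined  = tokens.join(" ")           # a b? c
escaped = Regexp.escape(joined)      # a\ b\?\ c   (the spaces get escaped too)
pattern = escaped.gsub(" ", "\\s+")  # a\\s+b\?\\s+c (the backslash before each space is left behind)

puts joined
puts escaped
puts pattern

# The leftover backslash pairs up with the one from "\\s+",
# so the regex looks for a literal backslash followed by "s" characters.
p Regexp.new(pattern).match(line)    # nil
```

The leftover backslash in front of each space is what turns the intended \s+ into an escaped backslash followed by s+.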
You need to escape each token individually and join them with \s+, or join them with a space and later replace the space with \s+:
/#{tokens.map { |n| Regexp.escape(n) }.join("\\s+")}/.match(line)
OR
/#{tokens.map { |n| Regexp.escape(n) }.join(" ").gsub(" ", "\\s+")}/.match(line)
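As a quick check, the first variant can be wrapped in a small helper and run against the line from the question. This is only a sketch; the method name tokens_regex is hypothetical and not part of Ruby or the original post.

```ruby
# Hypothetical helper: escape each token on its own, then join with \s+
def tokens_regex(tokens)
  Regexp.new(tokens.map { |t| Regexp.escape(t) }.join("\\s+"))
end

tokens = ["a", "b?", "c"]
line   = "1\ta\tb?\tc\t3"

re = tokens_regex(tokens)
puts re.source     # a\s+b\?\s+c

m = re.match(line)
p m[0]             # "a\tb?\tc"
```

Joining directly with "\\s+" is probably the safer of the two variants: if a token itself contains a space, Regexp.escape turns it into a backslash followed by a space, and the later gsub(" ", "\\s+") would damage that escaped space in the same way as in the original attempt.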
Upvotes: 2