Ludovic Kuty

Reputation: 4954

Ruby Regexp: difference between new and union with a single regexp

I have simplified the examples. Say I have a string containing the code for a regex. I would like the regex to match a literal dot and thus I want it to be:

\.

So I create the following Ruby string:

"\\."

However, when I use it with Regexp.union to create my regex, I get this:

irb(main):017:0> Regexp.union("\\.")
=> /\\\./

That will match a backslash followed by a dot, not just a single dot. Compare the previous result to this:

irb(main):018:0> Regexp.new("\\.")
=> /\./

which gives the Regexp I want but without the needed union.

Could you explain why Ruby behaves like that, and how to build the correct union of regexes? The context is importing JSON strings that describe regexes and union-ing them in Ruby.

Upvotes: 5

Views: 822

Answers (2)

molf

Reputation: 74945

A string passed to Regexp.union is matched literally. There is no need to escape it yourself: Regexp.union already escapes it internally, as Regexp.escape would.

Regexp.union(".")
#=> /\./
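As a minimal sketch of that difference, using the string from the question (the variable names are only illustrative), the same two-character string can be compared when treated as literal text and when treated as regex source:

```ruby
# The question's string: a backslash followed by a dot (two characters).
pattern_text = "\\."

literal = Regexp.union(pattern_text)   # string is treated as literal text
parsed  = Regexp.new(pattern_text)     # string is treated as regex source

puts literal.source == Regexp.escape(pattern_text)  # true
puts literal.match?("\\.")                          # true: backslash + dot
puts literal.match?(".")                            # false: no backslash
puts parsed.match?(".")                             # true: a single dot
```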

If you want to pass regular expressions to Regexp.union, don't use strings:

Regexp.union(Regexp.new("\\."))
#=> /\./
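For the JSON use case from the question, one possible approach is to convert each imported string with Regexp.new before calling Regexp.union. This is only a sketch: the payload below is a made-up example, and Regexp.new raises RegexpError on invalid source, so real input may need error handling.

```ruby
require "json"

# Hypothetical payload: each element is regex source, not literal text.
payload = <<~'JSON'
  ["\\.", "a+b"]
JSON

sources  = JSON.parse(payload)                    # Ruby strings \. and a+b
patterns = sources.map { |src| Regexp.new(src) }  # compile each one first
combined = Regexp.union(patterns)                 # union of Regexp objects

# For comparison: passing the raw strings escapes them as literal text.
naive = Regexp.union(sources)

puts combined.match?("file.txt")  # true: matches the dot
puts combined.match?("aaab")      # true: matches a+b as a pattern
puts naive.match?("file.txt")     # false: needs a literal backslash + dot
puts naive.match?("a+b")          # true: matches the literal text a+b
```

Regexp.union accepts either a list of patterns or a single array, so the mapped array can be passed directly.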

Upvotes: 5

Ben

Reputation: 13615

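I think \\. is where you went wrong. To match a literal . the pattern only needs \. but Regexp.union escapes the string you give it, so the backslash you supplied is escaped as well. You end up with an escaped \ followed by \. which matches a backslash and then a dot.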

To be safe, just use a regexp literal, which would be Regexp.new(/\./) in your case.

If you want to use union, just call Regexp.union("."), which returns /\./

From the Ruby Regexp class documentation:

Regexp.union("a+b*c")                #=> /a\+b\*c/

Upvotes: 0
