Rubén Jiménez

Reputation: 1845

How do I create regexps to detect integers and ranges?

I'm trying to solve problem #6 from the Ruby Quiz book. The problem asks for a new class method, build(), on the Regexp class: given integers or ranges, it must generate a regex that matches only the allowed numbers.

For example:

lucky = Regexp.build(3, 7)
"7" =~ lucky # => true
"13" =~ lucky # => false
"3" =~ lucky # => true

month = Regexp.build(1..12)
"0" =~ month # => false
"1" =~ month # => true
"12" =~ month # => true

I wrote a first version, but it doesn't work as expected. My problem is generating the correct regex: none of the patterns I tried in Rubular match what they should. For example, for Regexp.build(1, 3, 5) I got a pattern like this one:

/^1|3|5$/

This matches 1, 3 and 5, but it also matches 15 and 13.

What's the best way to keep the alternatives from matching inside larger numbers?

---- EDIT

Using groups, it now seems to work properly. Still, is there a way to generate a regexp that represents a range? For example, continuing the previous example:

lucky = Regexp.build(1..12)
"7" =~ lucky # => true
"13" =~ lucky # => false
"0" =~ lucky # => false
"5" =~ lucky # => true

The regexp generated by Regexp.build would have to match every value between 1 and 12, and nothing else. I've been searching the web, and it seems complicated to generate this kind of regex programmatically. Is there an existing or predefined method for this task?

http://utilitymill.com has a recursive function that does this, but I find it rather complicated.

Upvotes: 0

Views: 113

Answers (4)

the Tin Man

Reputation: 160551

The problem is your pattern is allowing in-word (in-number?) matches. I'd use this:

/\b(?:3|7)\b/

It matches only a standalone 3 or 7.

It is tested at: http://rubular.com/r/0rRUfXdlTJ

This pattern will work on a stand-alone value, or numbers embedded in a string.

\b is a word-boundary marker: it matches at a transition between a word character and a non-word character (or the start or end of the string), in either direction. A word character is [a-zA-Z0-9_].
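To see the boundary behaviour concretely, here is a small sketch. The sample strings are made up for illustration; match? assumes Ruby 2.4 or later.

```ruby
# \b sits between a word character ([a-zA-Z0-9_]) and a non-word
# character or the edge of the string.
pattern = /\b(?:3|7)\b/

standalone = "7".match?(pattern)            # the digit is the whole string
embedded   = "roll a 7 now".match?(pattern) # the digit is surrounded by spaces
in_number  = "13".match?(pattern)           # 3 follows the word character 1
in_word    = "a7".match?(pattern)           # 7 follows the word character a

puts "standalone=#{standalone} embedded=#{embedded}"
puts "in_number=#{in_number} in_word=#{in_word}"
```

Note the last case: because letters and underscores are word characters too, \b also rejects a digit glued to a letter, which may or may not be what you want.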

Using your tests:

"7" =~ /\b(?:3|7)\b/   # => 0
"13" =~ /\b(?:3|7)\b/  # => nil
"3" =~ /\b(?:3|7)\b/   # => 0

"0" =~ /\b(?:1|2|3|4|5|6|7|8|9|10|11|12)\b/   # => nil
"1" =~ /\b(?:1|2|3|4|5|6|7|8|9|10|11|12)\b/   # => 0
"12" =~ /\b(?:1|2|3|4|5|6|7|8|9|10|11|12)\b/  # => 0

Where => 0 means the pattern matched at the first character index, and nil was a miss.

That said, I wouldn't try to use a pattern to enforce a range, because that's trying to make regular expressions into something they're not really good for. Take a look at the patterns used to test for acceptable IPv4 addresses, or worse, IPv6 addresses. For real fun, look at the pattern for a valid email address. They all have specs defining what the valid values are, but the patterns that implement those specs are convoluted and beyond the ken of mortal men. Instead, use a pattern to locate the things that look like numbers, extract each value, and test whether it is in the acceptable range.

For instance, here's the IPv4 pattern from Ruby's Resolv::IPv4::Regex:

/\A((?x-mi:0
               |1(?:[0-9][0-9]?)?
               |2(?:[0-4][0-9]?|5[0-5]?|[6-9])?
               |[3-9][0-9]?))\.((?x-mi:0
               |1(?:[0-9][0-9]?)?
               |2(?:[0-4][0-9]?|5[0-5]?|[6-9])?
               |[3-9][0-9]?))\.((?x-mi:0
               |1(?:[0-9][0-9]?)?
               |2(?:[0-4][0-9]?|5[0-5]?|[6-9])?
               |[3-9][0-9]?))\.((?x-mi:0
               |1(?:[0-9][0-9]?)?
               |2(?:[0-4][0-9]?|5[0-5]?|[6-9])?
               |[3-9][0-9]?))\z/

Longer values, like IPv6 addresses, make it even harder. See "Regular expression that matches valid IPv6 addresses" for more information. So my recommendation is to use regular expressions for simple things, and take advantage of what they're good at -- extracting values that match a pattern -- then use additional code to verify that the values are in range or truly valid.
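A minimal sketch of that extract-then-verify approach might look like the following. The helper name numbers_in_range is hypothetical, and the sketch ignores signs and decimals:

```ruby
# Hypothetical helper: find every run of digits, convert it to an
# integer, and keep only the values that fall inside the allowed range.
def numbers_in_range(text, range)
  text.scan(/\d+/).map(&:to_i).select { |n| range.cover?(n) }
end

months = numbers_in_range("0 1 12 13 and 7", 1..12)
p months
```

Here the pattern only has to recognise "something that looks like a number"; the range check is ordinary Ruby, so widening the range to 1..365 needs no change to the regex.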

Upvotes: 0

Deepak Kumar Jha

Reputation: 482

/(^|\D)1(\D|$)|(^|\D)3(\D|$)|(^|\D)5(\D|$)/

This regex matches 1, 3 and 5 individually; it does not match 13 or 15.

If I've misunderstood anything, please explain in more detail what you want.

Thank you.
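One thing to be aware of: (^|\D) and (\D|$) consume the neighbouring characters, so they become part of the match. As a hedged alternative (not part of the pattern above), lookarounds can assert "no digit on either side" without consuming anything:

```ruby
consuming  = /(^|\D)1(\D|$)|(^|\D)3(\D|$)|(^|\D)5(\D|$)/
lookaround = /(?<!\d)(?:1|3|5)(?!\d)/ # assumed variant using lookbehind/lookahead

text = "x3y"
consumed_match   = text[consuming]  # matched text includes the neighbours
lookaround_match = text[lookaround] # matched text is the digit only

puts consumed_match
puts lookaround_match
```

Both patterns reject "13" and "15"; they differ only in what the matched text contains.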

Upvotes: 2

Bryan Ash

Reputation: 4479

You want to match a single item that spans the whole string, from beginning to end. You can achieve this by grouping the alternatives with parentheses so the anchors apply to all of them. For example: /^(1|3|5)$/
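Building on the grouping idea, a rough sketch of Regexp.build could simply expand every range into its members and join them inside one anchored group. This is only one possible approach, not the book's solution, and it swaps ^ and $ for \A and \z so the pattern anchors to the whole string rather than to a line:

```ruby
class Regexp
  # Sketch of the quiz's build method: accepts integers and ranges,
  # expands each range to its members, and anchors the alternation.
  def self.build(*args)
    numbers = args.flat_map { |arg| arg.is_a?(Range) ? arg.to_a : [arg] }
    alternatives = numbers.uniq.map(&:to_s)
    # The non-capturing group keeps \A and \z outside the alternation.
    new("\\A(?:#{alternatives.join('|')})\\z")
  end
end

lucky = Regexp.build(3, 7)
month = Regexp.build(1..12)

p "7" =~ lucky
p "13" =~ lucky
p "12" =~ month
```

Enumerating works for small ranges like 1..12, but the pattern grows with every value, so a range such as 1..1_000_000 would produce an enormous regex. This version also matches only when the number is the entire string, and it does not accept leading zeros such as "07".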

Upvotes: 0

Tim Pietzcker

Reputation: 336108

A small hint:

/^1|3|5$/

means

/^1/ or /3/ or /5$/

Look into groups. They will help you make sure that the scope of the alternation doesn't include your start/end anchors.
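A quick way to convince yourself of the difference is to run both forms against the same inputs. The input list is just an illustrative sample:

```ruby
ungrouped = /^1|3|5$/     # read as: /^1/ or /3/ or /5$/
grouped   = /^(?:1|3|5)$/ # the anchors now apply to every alternative

%w[1 3 5 13 15 31].each do |s|
  puts "#{s}: ungrouped=#{s.match?(ungrouped)} grouped=#{s.match?(grouped)}"
end
```

The ungrouped pattern accepts every input in the list, while the grouped one accepts only 1, 3 and 5.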

Upvotes: 1
