somecode

Reputation: 51

Regex Match Roman Numerals from 0-39 Only

I am trying to write a regex that will match Roman numerals from 0 to 39 only. There are plenty of examples which match much larger Roman numerals, but I cannot figure out how to match this specific subset.

Upvotes: 5

Views: 2706

Answers (3)

ndnenkov

Reputation: 36110

Assuming you know you have valid Roman numerals and want to fetch only the ones <= 39, that is easy:

^[XVI]*$
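A minimal sketch of that filtering step in Python (the language is an assumption, since the question names none). The sample list is made up for illustration and is assumed to contain only well-formed numerals:

```python
import re

# Only X, V and I are allowed; well-formed numerals of 40 and above need L, C, D or M
SMALL = re.compile(r"^[XVI]*$")

# Hypothetical input, assumed to hold only well-formed Roman numerals
numerals = ["IV", "XII", "XL", "XXXIX", "C", "MCM", "XCIX"]

# Keep the numerals that are <= 39
small = [n for n in numerals if SMALL.match(n)]
print(small)
```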



If that is not the case, it's a little bit trickier, but you can still take advantage of the fact that the numbers that can be written with only X, V and I are exactly 1..39:

^X{0,3}(?:V?I{0,3}|I[VX])$


  • X{0,3} covers the tens: none, 10, 20 or 30
  • X{0,3}V?I{0,3} covers all but the ones that end with 4 or 9 (14, 29, etc.)
  • X{0,3}I[VX] covers exactly the ones ending with 4 or 9
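One way to check the breakdown above is to run the pattern against every short string over I, V and X. The sketch below uses Python, and to_roman is a hypothetical helper written only for this check:

```python
import re
from itertools import product

PATTERN = re.compile(r"^X{0,3}(?:V?I{0,3}|I[VX])$")

def to_roman(n):
    """Convert 1..39 to a Roman numeral (hypothetical helper for this check)."""
    units = ["", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]
    return "X" * (n // 10) + units[n % 10]

valid = {to_roman(n) for n in range(1, 40)}

# Every string over {I, V, X} of length 0..7 (XXXVIII, the longest valid numeral, has 7 letters)
candidates = ("".join(p) for k in range(8) for p in product("IVX", repeat=k))
matched = {s for s in candidates if PATTERN.match(s)}

# The pattern accepts the 39 valid numerals plus the empty string, and nothing else
assert matched == valid | {""}
```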

Note: these will also match an empty string, which is my interpretation of a Roman zero. If that is not the case, you can replace the * with + in the first regex, and add a positive lookahead, (?=.), at the start of the second.

Note 2: If they are not on separate lines (or in separate strings), you can replace ^ and $ with word boundaries (\b).
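A sketch of both notes in Python, under one stated substitution: because the pattern can match nothing at all, a bare pair of \b anchors can be satisfied by an empty match at the start of a word, so the in-text version below swaps them for lookarounds that play the same role. The sample sentence is made up for illustration:

```python
import re

CORE = r"X{0,3}(?:V?I{0,3}|I[VX])"

# Whole-string match; the lookahead rejects the empty string
NON_EMPTY = re.compile(r"^(?=.)" + CORE + r"$")

# In-text search; (?<!\w) and (?!\w) stand in for the two \b anchors,
# and (?=\w) plays the part of the non-empty lookahead
IN_TEXT = re.compile(r"(?<!\w)(?=\w)" + CORE + r"(?!\w)")

text = "Chapters IV, XII and XXXIX, but not XL or IIII."
found = IN_TEXT.findall(text)
print(found)
```

With this sample sentence, found holds IV, XII and XXXIX; XL and IIII are skipped because no complete word matches.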

Upvotes: 1

timolawl

Reputation: 5564

Got it. Try this:

/^(X{1,3})(I[XV]|V?I{0,3})$|^(I[XV]|V?I{1,3})$|^V$/

Update:

Zero doesn't exist in Roman numerals. Therefore feel free to tack on your own implementation for zero.
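A quick check of the pattern above, sketched in Python with the JavaScript-style / delimiters dropped. The to_roman helper and the list of bad inputs are assumptions made for this check:

```python
import re

# Same pattern as in the answer, without the surrounding slashes
PATTERN = re.compile(r"^(X{1,3})(I[XV]|V?I{0,3})$|^(I[XV]|V?I{1,3})$|^V$")

def to_roman(n):
    """Convert 1..39 to a Roman numeral (hypothetical helper for this check)."""
    units = ["", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]
    return "X" * (n // 10) + units[n % 10]

# Numbers in 1..39 whose numeral is accepted
accepted = [n for n in range(1, 40) if PATTERN.match(to_roman(n))]

# Inputs that should be refused, including the empty string
bad_inputs = ["", "XL", "IIII", "VV", "XXXX", "IXI"]
rejected = [s for s in bad_inputs if not PATTERN.match(s)]

assert accepted == list(range(1, 40))
assert rejected == bad_inputs
```

Unlike the patterns that allow an empty match, this one needs no extra lookahead, since each of its three alternatives consumes at least one letter.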

Upvotes: 2

justhalf

Reputation: 9117

I'm not sure how to represent 0 using Roman numerals. I assume that it has a separate token, N (see Wikipedia).

Assuming the regex tries to match the whole string (like in Java) and you have lookahead, you can use this regex:

(?=.)(X{0,3}(IX|IV|V?I{0,3})|N)

Explanation:

  • (?=.): ensure at least one character
  • X{0,3}: define the tens (0, 10, 20, 30)
  • (...): define the final digit
  • IX: 9
  • IV: 4
  • V?I{0,3}: 0-3, 5-8 (0 is not allowed as the whole number; it requires at least one X)
  • N: 0 (as the whole number)

If you represent 0 as an empty string, the regex is simpler:

X{0,3}(IX|IV|V?I{0,3})

since the lookahead and the N in the previous regex are there only to prevent matching the empty string.
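Both variants can be tried in Python, where re.fullmatch gives the whole-string behaviour that the answer assumes from Java. This is only a sketch, and is_numeral is a hypothetical wrapper name:

```python
import re

# Pattern from the answer, with N as the assumed token for zero
WITH_N = re.compile(r"(?=.)(X{0,3}(IX|IV|V?I{0,3})|N)")

# Simpler variant in which the empty string stands for zero
EMPTY_ZERO = re.compile(r"X{0,3}(IX|IV|V?I{0,3})")

def is_numeral(s, pattern=WITH_N):
    """True if the whole string matches, similar to Java's String.matches()."""
    return pattern.fullmatch(s) is not None

for s in ["N", "", "IV", "XXXIX", "XL", "NI"]:
    print(repr(s), is_numeral(s), is_numeral(s, EMPTY_ZERO))
```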

Upvotes: 1
