Reputation: 51
I am trying to write a regex that will match Roman numerals from 0 to 39 only. There are plenty of examples which match much larger Roman numerals, but I cannot figure out how to match this specific subset.
Upvotes: 5
Views: 2706
Reputation: 36110
Assuming you know you have valid Roman numerals and want to fetch only the ones <= 39, that is easy:
^[XVI]*$
If that is not the case, it's a little bit trickier, but you can still take advantage of the fact that all the numbers that can be represented only with X, V and I are 1..39:
^X{0,3}(?:V?I{0,3}|I[VX])$
X{0,3} covers 10, 20, 30
X{0,3}V?I{0,3} covers all but the ones that end with 4 or 9 (14, 29, etc.)
X{0,3}I[VX] covers exactly the ones ending with 4 or 9
Note: these will also match an empty string, which is my interpretation of a Roman zero. If that is not the case, you can replace the * with + for the first regex and add a positive lookahead ((?=.)) at the start of the second regex.
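As a sanity check on the breakdown above, here is a minimal Python sketch that runs the anchored pattern against generated numerals. The `to_roman` helper and the constant names are hypothetical, added only for this illustration, and the 1..99 test range is an arbitrary choice.

```python
import re

# The anchored pattern from the answer, plus the non-empty variant with (?=.).
PATTERN = re.compile(r"^X{0,3}(?:V?I{0,3}|I[VX])$")
NON_EMPTY = re.compile(r"^(?=.)X{0,3}(?:V?I{0,3}|I[VX])$")

def to_roman(n):
    """Hypothetical helper: standard subtractive numeral for 1 <= n <= 3999."""
    table = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    parts = []
    for value, symbol in table:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)

# Which of 1..99 does the pattern accept?
accepted = [n for n in range(1, 100) if PATTERN.match(to_roman(n))]
print(accepted == list(range(1, 40)))

# The plain pattern accepts the empty string, the lookahead variant does not.
print(bool(PATTERN.match("")), bool(NON_EMPTY.match("")))
```

Malformed strings such as IIII or VV are rejected as well, since I{0,3} and V? cap the repetitions.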
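Note 2: If they are not on separate lines (or in separate strings), you can replace ^ and $ with word boundaries (\b). Keep in mind that \b is zero-width, so the pattern can then succeed with an empty match at a word boundary, and the empty case has to be excluded there.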
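For numerals embedded in running text, one possible sketch in Python follows. It is a swapped-in variation, not the exact \b form: the lookarounds (?<!\w) and (?!\w) stand in for \b, and (?=[XVI]) is an added guard, because together they cannot succeed on an empty match. The name `find_numerals` and the sample sentence are made up for this example.

```python
import re

# Assumption: numerals are separated from other text by non-word characters.
# (?<!\w) and (?!\w) act like word boundaries. (?=[XVI]) requires a numeral
# letter at the start, so the trailing (?!\w) rejects an empty match there.
NUMERAL_IN_TEXT = re.compile(
    r"(?<!\w)(?=[XVI])X{0,3}(?:V?I{0,3}|I[VX])(?!\w)"
)

def find_numerals(text):
    """Return every standalone numeral in 1..39 found in text."""
    return NUMERAL_IN_TEXT.findall(text)

sample = "Chapters IV, XII and XXXIX come before chapter XL and the word MIX."
print(find_numerals(sample))
```

Note that an English "I" standing alone would also be reported, since it is indistinguishable from the numeral 1.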
Upvotes: 1
Reputation: 5564
Got it. Try this:
/^(X{1,3})(I[XV]|V?I{0,3})$|^(I[XV]|V?I{1,3})$|^V$/
Zero doesn't exist in Roman numerals. Therefore feel free to tack on your own implementation for zero.
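One way to tack on a zero, sketched in Python under the assumption that zero is written as a separate token. The `zero_token` parameter, its default "N", and the function name are all hypothetical choices for this example.

```python
import re

# The pattern from this answer, without the /.../ delimiters. It covers 1..39
# and never matches the empty string.
ONE_TO_39 = re.compile(r"^(X{1,3})(I[XV]|V?I{0,3})$|^(I[XV]|V?I{1,3})$|^V$")

def is_numeral_0_to_39(s, zero_token="N"):
    """True for the assumed zero token or for a numeral in 1..39."""
    return s == zero_token or ONE_TO_39.match(s) is not None

for s in ["N", "V", "IV", "XXXIX", "", "XL", "VX"]:
    print(repr(s), is_numeral_0_to_39(s))
```

Handling zero outside the regex keeps the pattern itself unchanged, so a different convention only requires passing another `zero_token`.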
Upvotes: 2
Reputation: 9117
I'm not sure how to represent 0 using Roman numerals. I assume that it has a separate token, N (see Wikipedia).
Assuming the regex tries to match the whole string (like in Java) and you have lookahead, you can use this regex:
(?=.)(X{0,3}(IX|IV|V?I{0,3})|N)
Explanation:
(?=.): ensures at least one character
X{0,3}: defines the tens (0, 10, 20, 30)
(...): defines the final digit
IX: 9
IV: 4
V?I{0,3}: 0-3 and 5-8 (0 is not allowed as a whole number, so it requires at least one X)
N: 0 (as a whole number)
If you represent 0 as an empty string, the regex is simpler:
X{0,3}(IX|IV|V?I{0,3})
since the lookahead and N in the previous regex are only there to prevent the empty string.
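A short Python sketch of both variants, assuming whole-string matching. `re.fullmatch` plays the role that `String.matches` plays in Java. The lookahead is written as (?=.), and the helper name `matches_whole` is made up for this example.

```python
import re

# Variant with a separate zero token N, the lookahead written out as (?=.).
WITH_N = re.compile(r"(?=.)(X{0,3}(IX|IV|V?I{0,3})|N)")
# Simpler variant where the empty string stands for zero.
EMPTY_ZERO = re.compile(r"X{0,3}(IX|IV|V?I{0,3})")

def matches_whole(pattern, s):
    """Whole-string match, comparable to Java's String.matches."""
    return pattern.fullmatch(s) is not None

for s in ["", "N", "IV", "XXXVIII", "XL", "IIII"]:
    print(repr(s), matches_whole(WITH_N, s), matches_whole(EMPTY_ZERO, s))
```

The two variants differ only on the empty string and on N. Every other input gets the same verdict from both.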
Upvotes: 1