Reputation: 649
I receive a regex as a string. Is it possible to tell whether the regex only permits numbers? The regexes I receive are mainly of the form:
But I may receive other regexes.
Upvotes: 3
Views: 1121
Reputation: 4981
As most people have pointed out, this is a particularly complex task to achieve with a simple regex, since the same thing can be written in many ways, including digits hidden inside character classes, negated character classes and so on. Nonetheless, I gave it a shot and tested it a bit; it works for basic scenarios.
The regex below matches any regex that matches only digits and no other characters. The matched regex may allow one or more digits, restrict itself to particular digits and so on; that doesn't really matter. The checking regex only ensures that the matched regex doesn't match any non-digits.
It supports:
- digit classes such as \d, [0-9], \p{N}, [123] and even literals like 4, but not negated character classes such as [^\WA-Za-z_] or ranges such as [.-:]
- the quantifiers *, +, ? and even {x,y}, including non-greedy and possessive quantifiers, i.e. \d*? and \d*+
- alternation with |, such as \d?|[34]?|123
Limitations:
- A (..) capturing group or (?:..) non-capturing group will fail, even though it might be digit-only.
- [^\WA-Za-z_] matches only digits, but it won't be accepted.
Regex:
^\^?((\(\?\<[=!][^\(\)]*?\))?(\[\d*(?:\d-\d)?\d*\]|\\d|\\p\{N\}|\d+(?:\|\d+)*)(\*|\+|\?|\{\d*,?\d*\})?(\?|\+)?(\(\?[=!][^\(\)]*?\))?)+(?:\|(?:(?:\(\?\<[=!][^\(\)]*?\))?(\[\d*(\d-\d)?\d*\]|\\d|\\p\{N\}|\d+(\|\d+)*)(\*|\+|\?|\{\d*,?\d*\})?(\?|\+)?(\(\?[=!][^\(\)]*?\))?))*\$?$
An easier way to visualise the solution is:
^(lookbehind)?(digit_classes)+(quantifier)?(quantifier_type)?(lookahead)?
lookbehind = (?<=.. or (?<!..
digit_classes = \d or [0-9] or \p{N} etc.
quantifier = * or + or ? or {,}
quantifier_type = ? or +
lookahead = (?=.. or (?!..
// Repeat the above to support 'OR' i.e |
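The outline above can be turned into code by assembling the pattern from named pieces. This Python sketch is only an illustration: the constant names and the `accepts_digits_only` helper are made up for this example, the pieces are written as non-capturing groups, and Python's `re` module is assumed as the engine that runs the checker (the regexes being checked are treated as plain text).

```python
import re

# Hypothetical names; each piece mirrors one line of the outline above.
LOOKBEHIND = r'(?:\(\?\<[=!][^\(\)]*?\))?'   # (?<=..) or (?<!..)
DIGIT_CLASSES = r'(?:\[\d*(?:\d-\d)?\d*\]|\\d|\\p\{N\}|\d+(?:\|\d+)*)'
QUANTIFIER = r'(?:\*|\+|\?|\{\d*,?\d*\})?'   # *, +, ? or {x,y}
QUANTIFIER_TYPE = r'(?:\?|\+)?'              # lazy ? or possessive +
LOOKAHEAD = r'(?:\(\?[=!][^\(\)]*?\))?'      # (?=..) or (?!..)

UNIT = LOOKBEHIND + DIGIT_CLASSES + QUANTIFIER + QUANTIFIER_TYPE + LOOKAHEAD

# One or more units, then optional '|'-separated units, with optional ^ and $.
CHECKER = re.compile(r'^\^?(?:' + UNIT + r')+(?:\|' + UNIT + r')*\$?$')


def accepts_digits_only(pattern_text: str) -> bool:
    """Return True if the checker recognises pattern_text as digit-only."""
    return CHECKER.match(pattern_text) is not None
```

With this sketch, `accepts_digits_only(r'^[0-9]{5,10}$')` returns True, while `accepts_digits_only(r'(\d)*')` returns False because groups are not handled.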
((\(\?\<[=!][^\(\)]*?\))?(\[\d*(?:\d-\d)?\d*\]|\\d|\\p\{N\}|\d+(?:\|\d+)*)(\*|\+|\?|\{\d*,?\d*\})?(\?|\+)?(\(\?[=!][^\(\)]*?\))?)+
This is the first capturing group. It supports all the digit representations; its parts are described in detail below.
- (\(\?\<[=!][^\(\)]*?\))? matches an optional positive or negative lookbehind.
  - \(\?\< matches the start of a lookbehind, i.e. (?<
  - [=!] follows, since the lookbehind may be positive or negative
  - [^\(\)]*? non-greedily allows any character other than ( or ) inside the lookbehind
- (\[\d*(?:\d-\d)?\d*\]|\\d|\\p\{N\}|\d+(?:\|\d+)*) matches the various digit representations, such as \d, [0-9] or \p{N}.
  - \[\d*(?:\d-\d)?\d*\] matches [0-9], [1234] or even [1-3567]
  - \\d matches \d directly
  - \\p\{N\} matches \p{N} directly
  - \d+(?:\|\d+)* allows literals such as 4, and multiple literals too, such as 4|6|8
- (\*|\+|\?|\{\d*,?\d*\})? matches all quantifiers, i.e. *, +, ? and {,}.
  - \*|\+|\? covers the basic quantifiers
  - \{\d*,?\d*\} supports quantifiers that specify minimum and maximum counts, such as \d{5,} or [0-9]{3,6}
- (\?|\+)? marks the type of quantifier: lazy, i.e. \d*?, or possessive, i.e. \d*+
- (\(\?[=!][^\(\)]*?\))? allows positive or negative lookaheads.

After this, the first capturing group is repeated once more to support | between multiple digit representations. That is, if the groups above are represented by (..)+, then to support | the group is duplicated like this: (..)+(\|(..))*, which gives the final regex.
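To get a feel for the individual pieces, they can be tried in isolation. The sketch below assumes Python's `re` module and uses variable names invented for this example; the sub-patterns themselves are copied from the breakdown above. The last check uses \d as a stand-in for a whole group to show the (..)+(\|(..))* shape, in which each branch after a | holds exactly one group as written.

```python
import re

# Sub-patterns copied from the breakdown; the variable names are illustrative.
bracket_class = re.compile(r'\[\d*(?:\d-\d)?\d*\]')
quantifier = re.compile(r'(\*|\+|\?|\{\d*,?\d*\})?(\?|\+)?')

# Character classes made only of digits and at most one digit range match.
for text in ['[0-9]', '[1234]', '[1-3567]']:
    assert bracket_class.fullmatch(text)

# Negated classes and non-digit ranges are rejected.
for text in ['[^0-9]', '[.-:]', '[a-z]']:
    assert bracket_class.fullmatch(text) is None

# Quantifiers, optionally followed by a lazy (?) or possessive (+) marker.
for text in ['*', '+', '?', '{5,}', '{3,6}', '*?', '*+']:
    assert quantifier.fullmatch(text)

# The (..)+(\|(..))* shape, with \d standing in for a full group.
shape = re.compile(r'(?:\\d)+(?:\|(?:\\d))*')
assert shape.fullmatch(r'\d\d|\d')
assert shape.fullmatch(r'\d|\d\d') is None  # second branch holds two groups
```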
Works for:
^[0-9]{6}$
^[0-9]+$
^[0-9]{5,10}$
\d[0][3-9]*?\d[0-7]*?$
\d*|[0-9]+|123
\d+(?!\s)
(?<=\w)[0-9]
Doesn't work for (but should work):
(\d)* # Capturing groups don't work
(?:\d+) # Non-capturing groups don't work
^[^\WA-Za-z_] # Negated character classes don't work
Note: All groups are capturing groups so that visualising them is easier. They can all be converted to non-capturing anytime.
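The two lists above can be checked mechanically. Here is a minimal sketch, assuming Python's `re` module runs the checker; the regex is the one from this answer pasted into raw strings, and the names `DIGIT_ONLY_CHECK`, `accepted` and `rejected` are invented for this example.

```python
import re

# The regex from this answer, copied verbatim and split in two for readability.
DIGIT_ONLY_CHECK = re.compile(
    r'^\^?((\(\?\<[=!][^\(\)]*?\))?(\[\d*(?:\d-\d)?\d*\]|\\d|\\p\{N\}|\d+(?:\|\d+)*)(\*|\+|\?|\{\d*,?\d*\})?(\?|\+)?(\(\?[=!][^\(\)]*?\))?)+'
    r'(?:\|(?:(?:\(\?\<[=!][^\(\)]*?\))?(\[\d*(\d-\d)?\d*\]|\\d|\\p\{N\}|\d+(\|\d+)*)(\*|\+|\?|\{\d*,?\d*\})?(\?|\+)?(\(\?[=!][^\(\)]*?\))?))*\$?$'
)

# Candidate regexes are plain text here; they are never compiled themselves.
accepted = [
    r'^[0-9]{6}$',
    r'^[0-9]+$',
    r'^[0-9]{5,10}$',
    r'\d[0][3-9]*?\d[0-7]*?$',
    r'\d*|[0-9]+|123',
    r'\d+(?!\s)',
    r'(?<=\w)[0-9]',
]
rejected = [
    r'(\d)*',           # capturing group
    r'(?:\d+)',         # non-capturing group
    r'^[^\WA-Za-z_]',   # negated character class
]

for candidate in accepted + rejected:
    verdict = 'digit-only' if DIGIT_ONLY_CHECK.match(candidate) else 'not recognised'
    print(candidate, '->', verdict)
```

Because the candidates are only ever treated as strings, it does not matter whether Python itself supports constructs such as \p{N}.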
Upvotes: 1
Reputation: 4039
^(\d|(?<!\^)\d-\d|\\d|\^|\$|\[|\]|{\d+(,\d+)?}|\+|\*|\\b|\\B|\\\d|\(\?[:=!<][^]+\)|\?|\||\((\d|(?<!\^)\d-\d|\\d|\^|\$|\[|\]|{\d+(,\d+)?}|\+|\*|\\b|\\B|\\\d|\(\?[:=!<][^]+\)|\?|\|)+\))+$
I know...I know
This only matches things that could appear in a regular expression that matches digits. This includes (?=My phone number is: )[\d-]+, which matches 123-4567-890 in My phone number is: 123-4567-890.
To test whether a regex only matches digits, try matching it against the pattern above. If it matches, then the regex is okay.
This doesn't catch invalid regexes, e.g. \d^\d$\d
If you notice any errors in it, then please let me know, and I'll correct it.
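For anyone who wants to try this outside a JavaScript-style engine, here is a rough Python adaptation. It is a sketch under two assumptions that are not part of the pattern as posted: `[^]` is replaced by `[\s\S]` (Python's `re` does not read `[^]` as "any character"), and the literal braces are escaped. The names `TOKEN` and `CHECKER` are invented for this example.

```python
import re

# Alternatives allowed at the top level and inside one level of parentheses.
# Assumptions: [\s\S] stands in for the original [^], and { } are escaped.
TOKEN = (
    r'\d|(?<!\^)\d-\d|\\d|\^|\$|\[|\]|\{\d+(,\d+)?\}|\+|\*'
    r'|\\b|\\B|\\\d|\(\?[:=!<][\s\S]+\)|\?|\|'
)

# Same overall shape as the posted pattern: tokens, or a parenthesised run of tokens.
CHECKER = re.compile(r'^(' + TOKEN + r'|\((' + TOKEN + r')+\))+$')

for candidate in [r'^[0-9]+$', r'(\d)*', r'\d^\d$\d', r'[^0-9]', r'[a-z]+']:
    print(candidate, '->', bool(CHECKER.match(candidate)))
```

In this adaptation the invalid \d^\d$\d is still accepted, as noted above, while [^0-9] and [a-z]+ are rejected.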
Upvotes: 0