Reputation: 9279
I have a tricky regular expression I need to implement and I'm not a great RegEx guy.
The rules are:
2 alphanumeric characters followed by a . or a - followed by 2 alphanumeric characters.
It cannot be empty and it cannot have only one pair (e.g. 01). The string can be up to 10 sets of 2 alphanumerics, e.g. 01.02.03.04.05.06.10, and the delimiter, once selected, cannot change. The expression also cannot end with a delimiter.
Examples are:
Valid:
a1.02.b3.00
01-02-aa-04
01.02
aa.bb
ac.21
Invalid:
aa.01-02
123.2.10
01
a1.
Ideas?
Upvotes: 0
Views: 1166
Reputation: 63872
Why make things more complicated than they have to be?
^[a-z0-9]{2}([.-])([a-z0-9]{2}\1){0,8}[a-z0-9]{2}$
Depending on where you are using this regular expression, you have a few options for making it match uppercase characters as well.
If you are writing the regexp as /regular-expression/: use /i as a modifier (case-insensitive match).
If you are using regular expressions under .NET (as you have noted), use the IgnoreCase option.
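For anyone who wants to try this outside .NET, here is a minimal sketch in Python that runs the pattern against the examples from the question. The helper name is_valid is mine, not part of any library, and re.IGNORECASE stands in for the /i modifier or the .NET IgnoreCase option mentioned above.

```python
import re

# The pattern from this answer. IGNORECASE accepts uppercase letters too;
# ASCII keeps the case-insensitive matching limited to ASCII letters.
PATTERN = re.compile(
    r"^[a-z0-9]{2}([.-])([a-z0-9]{2}\1){0,8}[a-z0-9]{2}$",
    re.IGNORECASE | re.ASCII,
)


def is_valid(s: str) -> bool:
    """Hypothetical helper: True if s satisfies the rules in the question."""
    # fullmatch is used because "$" in Python also matches just before a
    # trailing newline, which would let "01.02\n" through with match().
    return PATTERN.fullmatch(s) is not None


valid = ["a1.02.b3.00", "01-02-aa-04", "01.02", "aa.bb", "ac.21"]
invalid = ["aa.01-02", "123.2.10", "01", "a1."]

for s in valid + invalid:
    print(s, is_valid(s))
```

The {0,8} in the middle plus the fixed first and last pairs is what gives the range of 2 to 10 pairs.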
Explanation of the different parts in this regex
^[a-z0-9]{2}
the string must start with two alphanumeric characters ([a-z0-9], plus A-Z when matching case-insensitively)
([.-])
the next character must be either a dot or a hyphen; from now on \1 will contain this value
([a-z0-9]{2}\1){0,8}
we want zero to eight repetitions of two alphanumeric characters followed by the first delimiter used
[a-z0-9]{2}$
the string must end with two alphanumeric characters
Upvotes: 3
Reputation: 594
I tried this:
^[[:alnum:]]{2}([-.])[[:alnum:]]{2}(?:\1[[:alnum:]]{2}){0,8}$
You need the anchors on both ends to make it match the whole string. Using [[:alnum:]] matches all alphanumerics, based on locale. If you want only the ones we consider in English, regardless of locale, you would want to use [A-Za-z0-9] in each case instead.
The trickiest part is the backreference, \1, which makes sure that you always use the same delimiter; it refers to the capturing parentheses in ([-.]). Thus, when you have 0-8 more repetitions of a delimiter followed by 2 alphanumerics, the delimiter is always the same.
I tried this in Perl, and it passes a few test strings that I threw at it. Your mileage might vary if you're using a different language/library.
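As an illustration of the "different language/library" caveat, here is a rough Python sketch of the same idea. Python's re module does not support POSIX classes such as [[:alnum:]], so this version assumes the explicit [A-Za-z0-9] set instead. The helper name delimiter_of is made up for this example; it returns whatever group 1 captured, to show what \1 refers to.

```python
import re

# Same structure as the Perl pattern above, with [[:alnum:]] replaced by
# [A-Za-z0-9] because Python's re has no POSIX bracket classes.
PATTERN = re.compile(
    r"^[A-Za-z0-9]{2}([-.])[A-Za-z0-9]{2}(?:\1[A-Za-z0-9]{2}){0,8}$"
)


def delimiter_of(s):
    """Hypothetical helper: the delimiter captured by group 1, or None if s is rejected."""
    m = PATTERN.fullmatch(s)
    return m.group(1) if m else None


print(delimiter_of("01-02-aa-04"))  # group 1 holds the hyphen
print(delimiter_of("a1.02.b3.00"))  # group 1 holds the dot
print(delimiter_of("aa.01-02"))     # None: \1 is "." but a "-" follows
```

Because \1 must repeat exactly what ([-.]) captured, a string that switches delimiters partway through cannot match.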
Upvotes: 1
Reputation: 1791
Possibly...
([\p{L}0-9]{2})(\.|-)([\p{L}0-9]{2})
This handles Unicode letters too, but I'm not sure that it is correct for your needs: the first two lines in your "Valid" set have the form <2 alpha-num><.><2 alpha-num><.><2 alpha-num><.><2 alpha-num>, not the <2 alpha-num><.><2 alpha-num> format described in the question.
Hope this helps.
Upvotes: 0
Reputation: 2311
It will be something like:
^([a-z0-9]{2}([.][a-z0-9]{2}){1,9}|[a-z0-9]{2}([-][a-z0-9]{2}){1,9})$
^ and $ anchor the match to the whole string
{2} means exactly 2
{1,9} means at least one, and up to 9
(something) is a grouping
a|b means match either a or b
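The alternation idea can also be sketched in Python, where groups are written with plain parentheses. This is only an illustration under my own naming (PAIR, branch and is_valid are not from any library): one branch per delimiter, so no backreference is needed.

```python
import re

PAIR = "[a-z0-9]{2}"


def branch(delim_class: str) -> str:
    """Build one alternative: a pair, then 'delimiter + pair' 1 to 9 times."""
    return PAIR + "(?:" + delim_class + PAIR + "){1,9}"


# Two alternatives, one for "." and one for "-", wrapped in a single group.
PATTERN = re.compile("(?:" + branch("[.]") + "|" + branch("[-]") + ")")


def is_valid(s: str) -> bool:
    # fullmatch requires the whole string to match one of the two branches.
    return PATTERN.fullmatch(s) is not None


for s in ["01.02", "01-02-aa-04", "aa.01-02", "01", "a1."]:
    print(s, is_valid(s))
```

The trade-off is that the pair pattern is written twice, once per delimiter, which is why the helper builds each branch from the same pieces. As written it only accepts lowercase letters; passing re.IGNORECASE to re.compile would lift that.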
Upvotes: 0
Reputation: 121810
This may work:
^[a-zA-Z0-9]{2}(?:([.-])[a-zA-Z0-9]{2})(?:\1[a-zA-Z0-9]{2}){0,8}$
Upvotes: -1