MikeTWebb

Reputation: 9279

Regular Expression with delimiters and length

I have a tricky regular expression I need to implement and I'm not a great RegEx guy.

The rules are:

2 alphanumeric characters, followed by a . or a -, followed by 2 alphanumeric characters.

It cannot be empty, and it cannot consist of only one pair (e.g. 01). The string can contain up to 10 sets of 2 alphanumerics, e.g. 01.02.03.04.05.06.10, and the delimiter, once selected, cannot change. The expression also cannot end with a delimiter.

Examples are:

Valid:

a1.02.b3.00
01-02-aa-04
01.02
aa.bb
ac.21

Invalid:

aa.01-02
123.2.10
01
a1.

Ideas?

Upvotes: 0

Views: 1166

Answers (5)

Filip Roséen

Reputation: 63872

EY, GUYS!?

Why make things more complicated than they have to be?

^[a-z0-9]{2}([.-])([a-z0-9]{2}\1){0,8}[a-z0-9]{2}$

Depending on where you are using this regular expression, you have a few options for making it match uppercase characters as well.


If you are writing the regexp as /regular-expression/: Use /i as modifier (case-insensitive match).

If you are using regular expressions under .NET (as you have noted) you use the IgnoreCase option.


Explanation of the different parts of this regex

  • ^[a-z0-9]{2} the string must start with two alphanumeric characters (with a case-insensitive match this is equivalent to [a-zA-Z0-9]; note that \w would also allow an underscore)
  • ([.-]) the next character must be either a dot or a hyphen; from now on \1 will contain this value
  • ([a-z0-9]{2}\1){0,8} zero to eight repetitions of two alphanumeric characters followed by the same delimiter that was captured first
  • [a-z0-9]{2}$ the string must end with two alphanumeric characters.
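To check the pattern against the examples from the question, here is a minimal sketch in Python rather than .NET (an assumption made only so the snippet is easy to run; the backreference syntax is the same). The helper name is_valid is hypothetical, and re.IGNORECASE stands in for the IgnoreCase option mentioned above.

```python
import re

# Pattern copied from the answer; re.IGNORECASE plays the role of .NET's IgnoreCase.
PATTERN = re.compile(
    r"^[a-z0-9]{2}([.-])([a-z0-9]{2}\1){0,8}[a-z0-9]{2}$",
    re.IGNORECASE,
)


def is_valid(text: str) -> bool:
    """Return True when text is 2 to 10 alphanumeric pairs joined by one delimiter."""
    # fullmatch is used because Python's $ would otherwise accept a trailing newline.
    return PATTERN.fullmatch(text) is not None


# Examples taken from the question, plus the empty string.
valid = ["a1.02.b3.00", "01-02-aa-04", "01.02", "aa.bb", "ac.21"]
invalid = ["aa.01-02", "123.2.10", "01", "a1.", ""]

for sample in valid + invalid:
    print(f"{sample!r}: {is_valid(sample)}")
```

Every string in the valid list should print True and every string in the invalid list should print False.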

Upvotes: 3

David Brigada

Reputation: 594

I tried this:

^[[:alnum:]]{2}([-.])[[:alnum:]]{2}(?:\1[[:alnum:]]{2}){0,8}$

You need the anchors on both ends to make it match the whole string. Using [[:alnum:]] matches all alphanumerics, based on locale. If you want only the ones we consider in English, regardless of locale, you would want to use [A-Za-z0-9] in each case instead.

The trickiest part is the backreference, \1, which makes sure that you always use the same delimiter: it refers to the capturing parentheses in ([-.]). Thus, when you have 0 to 8 more repetitions of the delimiter followed by 2 alphanumerics, the delimiter is always the same.

I tried this in Perl, and it passes a few test strings that I threw at it. Your mileage might vary if you're using a different language/library.
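As an illustration of the "different language/library" caveat, here is a rough sketch of the same pattern in Python. Python's re module does not support the POSIX class [[:alnum:]], so this uses the locale-independent [A-Za-z0-9] variant described above. The function name matches_pairs is hypothetical.

```python
import re

# Same structure as the Perl pattern, with [A-Za-z0-9] in place of [[:alnum:]].
PATTERN = re.compile(
    r"^[A-Za-z0-9]{2}([-.])[A-Za-z0-9]{2}(?:\1[A-Za-z0-9]{2}){0,8}$"
)


def matches_pairs(text: str) -> bool:
    """Return True when the whole string matches the pair/delimiter pattern."""
    return PATTERN.fullmatch(text) is not None


ten_pairs = "-".join(["ab"] * 10)     # upper limit from the question
eleven_pairs = "-".join(["ab"] * 11)  # one pair too many

print(matches_pairs(ten_pairs))     # True
print(matches_pairs(eleven_pairs))  # False
print(matches_pairs("aa.01-02"))    # False: the delimiter changes
```

The {0,8} quantifier is what enforces the limit: two mandatory pairs plus at most eight more gives ten in total.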

Upvotes: 1

Michael S.

Reputation: 1791

Possibly...

([\p{L}0-9]{2})(\.|-)([\p{L}0-9]{2})

This handles Unicode letters too, but I'm not sure that it is correct for your needs as your first two lines in the "Valid" set contain items that are <2 alpha-num><.><2 alpha-num><.><2 alpha-num><.><2 alpha-num> and not the format that you mention in the question where you are looking for <2 alpha-num><.><2 alpha-num>

Hope this helps.

Upvotes: 0

smendola

Reputation: 2311

It will be something like:

^([a-z0-9]{2}([.][a-z0-9]{2}){1,9}|[a-z0-9]{2}([-][a-z0-9]{2}){1,9})$

{2} means exactly 2

{1,9} means at least one, and up to 9

(something) is a grouping

a|b means match either a or b

^ and $ anchor the match to the whole string
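The alternation idea (one branch per delimiter, no backreference) can be sketched in Python as follows. This is only an illustration under the assumption that Python's re is an acceptable stand-in for the target engine. The names PAIR, branch and is_valid are hypothetical.

```python
import re

PAIR = "[a-z0-9]{2}"  # exactly two alphanumeric characters


def branch(delimiter: str) -> str:
    """Build one alternative: a pair, then 1 to 9 repetitions of delimiter + pair."""
    return PAIR + "(?:" + re.escape(delimiter) + PAIR + "){1,9}"


# One alternative for "." and one for "-", so the delimiter cannot be mixed.
PATTERN = re.compile("|".join(branch(d) for d in ".-"), re.IGNORECASE)


def is_valid(text: str) -> bool:
    # fullmatch anchors the alternation to the whole string.
    return PATTERN.fullmatch(text) is not None


for sample in ["01.02", "01-02-aa-04", "aa.01-02", "01", "a1."]:
    print(f"{sample!r}: {is_valid(sample)}")
```

Compared with a backreference, this repeats the pair pattern once per delimiter, which gets verbose if more delimiters are allowed, but it also works in engines that lack backreferences.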

Upvotes: 0

fge

Reputation: 121810

This may work:

^[\w\d][\w\d](?:([.-])[\w\d][\w\d])(?:\1[\w\d][\w\d]){0,8}$

Upvotes: -1
