Reputation: 3446
I am looking to match a regex with either 2 [0-9] repeats (and then some other pattern)
[0-9]{2}[A-z]{4}
OR 6 [0-9] repeats (and then some other pattern)
[0-9]{6}[A-z]{4}
The following is too inclusive:
[0-9]{2,6}[A-z]{4}
QUESTION
Is there a way that I can specify either 2 or 6 repeats?
Upvotes: 1
Views: 468
Reputation: 71578
The classic way would be:
(?:[0-9]{2}|[0-9]{6})[A-z]{4}
(Literally: either [0-9]{2} OR [0-9]{6}.)
But you can also use this one, which should be a little more efficient than the above with less potential backtracking:
[0-9]{2}(?:[0-9]{4})?[A-z]{4}
(Here, [0-9]{2} is followed by an optional further four [0-9], which makes a total of six [0-9] when they are present, as required.)
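As a quick sanity check, here is a minimal sketch in Python using the standard re module. The sample strings are made up for illustration, and fullmatch is used so that the whole string has to match. It suggests that both patterns accept exactly the 2-digit and 6-digit cases:

```python
import re

# The two patterns from this answer, unchanged.
ALTERNATION = re.compile(r"(?:[0-9]{2}|[0-9]{6})[A-z]{4}")
OPTIONAL_TAIL = re.compile(r"[0-9]{2}(?:[0-9]{4})?[A-z]{4}")

# Hypothetical inputs with 2, 4, 6 and 7 leading digits.
samples = ["12abcd", "1234abcd", "123456abcd", "1234567abcd"]

# Map each sample to (alternation result, optional-tail result).
results = {
    s: (bool(ALTERNATION.fullmatch(s)), bool(OPTIONAL_TAIL.fullmatch(s)))
    for s in samples
}

for s, (alt, opt) in results.items():
    print(f"{s!r}: alternation={alt}, optional_tail={opt}")
```

Only the 2-digit and 6-digit samples should come back as matches for either pattern.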
You might not be aware that [A-z] matches more than just letters: it also matches a few other characters.
The range [A-z] is effectively equivalent to:
[A-Z\[\\\]^_`a-z]
Notice that the additional characters that match are:
[ \ ] ^ _ `
(The spaces are there only for separation and are not part of the set.)
This is because those characters sit between the uppercase and lowercase letters in the ASCII table (and therefore in Unicode as well).
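To see this for yourself, here is a small Python sketch (standard re module only) that lists every ASCII character accepted by [A-z] but rejected by [A-Za-z]:

```python
import re

# Every ASCII character that [A-z] matches but [A-Za-z] does not.
extras = [
    chr(code)
    for code in range(128)
    if re.fullmatch(r"[A-z]", chr(code))
    and not re.fullmatch(r"[A-Za-z]", chr(code))
]

print(extras)
print([ord(c) for c in extras])  # the code points between 'Z' (90) and 'a' (97)
```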
Upvotes: 3
Reputation: 22064
This should work
(?:[0-9]{2}|[0-9]{6})[a-zA-Z]{4}
Do you have some test cases I can verify it with?
However, if you don't anchor the start of the regex to a word boundary (\b) or the start of a line (^), then 1234asdf will give 34asdf as a partial match.
So either
\b(?:[0-9]{2}|[0-9]{6})[a-zA-Z]{4}
or
^(?:[0-9]{2}|[0-9]{6})[a-zA-Z]{4}
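A short Python sketch of the difference, using re.search on a 4-digit input that should not match at all (the variable names are just for illustration):

```python
import re

text = "1234asdf"  # four digits, so no match is wanted

# Same pattern three ways: no anchor, word boundary, start of line.
unanchored = re.search(r"(?:[0-9]{2}|[0-9]{6})[a-zA-Z]{4}", text)
word_anchored = re.search(r"\b(?:[0-9]{2}|[0-9]{6})[a-zA-Z]{4}", text)
line_anchored = re.search(r"^(?:[0-9]{2}|[0-9]{6})[a-zA-Z]{4}", text)

print(unanchored.group() if unanchored else None)  # the partial match
print(word_anchored)
print(line_anchored)
```

The unanchored search finds 34asdf inside the string, while both anchored versions find nothing.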
As a quick rundown of the regex changes:
(?: ) creates a non-capturing group
| selects between the alternatives [0-9]{2} and [0-9]{6}
^ matches the start of a line
$ matches the end of a line
\b matches a word boundary
[a-zA-Z] is being used instead of [A-z], as it's likely what was intended (all alpha characters, regardless of case)
You can also replace your [0-9]s with \d, which is shorthand for any digit. The best way I can think of to write this, and not get partial matches, is as follows:
(?:\b|^)(?:\d{2}|\d{6})[a-zA-Z]{4}(?:\b|$)
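Here is a rough Python check of that final pattern against a handful of made-up inputs. The find helper is hypothetical and only there to keep the example short:

```python
import re

PATTERN = re.compile(r"(?:\b|^)(?:\d{2}|\d{6})[a-zA-Z]{4}(?:\b|$)")


def find(text):
    """Return the matched substring, or None when nothing matches."""
    m = PATTERN.search(text)
    return m.group() if m else None


# Hypothetical inputs: valid 2- and 6-digit cases, then several near misses.
samples = [
    "12abcd",
    "123456abcd",
    "id 12abcd here",
    "1234abcd",
    "12abcde",
    "x12abcd",
]

found = {s: find(s) for s in samples}

for s, hit in found.items():
    print(f"{s!r} -> {hit!r}")
```

Note that in Python 3, \d on str patterns also matches non-ASCII decimal digits unless the re.ASCII flag is passed. If that matters for your input, stick with [0-9].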
Upvotes: 3
Reputation: 24078
You can use the alternation operator | within a non-capturing group, like this:
(?:[0-9]{2}|[0-9]{6})[A-z]{4}
Be aware that [A-z] doesn't only include lowercase and uppercase letters, but also [, \, ], ^, _, and `, which lie between Z and a in the ASCII code points. Use [A-Za-z] for letters, as pointed out by @AlanMoore in his comment.
Upvotes: 5