Reputation: 53
I'm just getting into regex and I am trying to write one for a UK National Insurance number, for example ab123456c.
I've currently got this, which works:
^[jJ]{2}[\-\s]{0,1}[0-9]{2}[\-\s]{0,1}[0-9]{2}[\-\s]{0,1}[0-9]{2}[\-\s]{0,1}[a-zA-Z]$
but I was wondering if there is a shorter version, for example
^[jJ]{2} [ [\-\s]{0,1}[0-9]{2} ]{3} [\-\s]{0,1}[a-zA-Z]$
That is, repeat [-\s]{0,1}[0-9]{2} three times by wrapping it in some sort of group, like [ * ]{3}.
Upvotes: 1
Views: 65
Reputation: 513
If I understood you correctly, your insurance numbers are always two letters, six digits, and a final letter A, B, C or D? Wouldn't the easiest way be to try something like this:
/\w{2}\d{6}[A-D]/
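You catch the first two characters with \w{2} (note that \w also matches digits and the underscore, so [A-Za-z]{2} is stricter if only letters are allowed), then you get six digits with \d{6}, and you end with a letter from A to D with [A-D].
Or, if the blanks are important, try this: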
/\w{2}\d\d \d\d \d\d [A-D]/
I don't think it can be shortened much further. A repeated group such as (\d\d ){3} saves a little: the quantifier repeats the sub-pattern, not the text it matched, so it accepts 12 34 56 as well as 23 23 23.
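For what it's worth, here is a quick check of how a quantified group and \w behave. This is only a sketch in Python's re module (the thread does not name a language, so that choice is an assumption); the sample strings are made up for illustration.

```python
import re

# A quantifier on a group repeats the sub-pattern, not the text it matched.
pairs = re.compile(r"(\d\d ){3}")
print(bool(pairs.fullmatch("12 34 56 ")))  # three different digit pairs
print(bool(pairs.fullmatch("23 23 23 ")))  # three identical digit pairs

# \w matches letters, digits and underscore, so it is looser than "two letters".
loose = re.compile(r"\w{2}\d{6}[A-D]")
strict = re.compile(r"[A-Za-z]{2}\d{6}[A-D]")
print(bool(loose.fullmatch("12123456C")))   # digits accepted as the "letters"
print(bool(strict.fullmatch("12123456C")))  # rejected by the letters-only class
```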
If you really want to learn regex, I suggest this tutorial; it helped me a lot when I was starting out with regular expressions.
Upvotes: 3
Reputation: 89547
A quick search for a regex tutorial in your favorite search engine (DuckDuckGo, of course) would give you the answer faster than asking in a forum!
So what you are looking for is a non-capturing group (?:...). You can rewrite your pattern like this:
^[jJ]{2}(?:[-\s]?[0-9]{2}){3}[-\s]?[a-zA-Z]$
or like this if you use a case-insensitive flag/option:
^J{2}(?:[-\s]?[0-9]{2}){3}[-\s]?[A-Z]$
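To check that the grouped rewrites behave like the original, here is a minimal sketch assuming Python's re module (the question does not say which engine is used). The sample inputs are hypothetical and only there to exercise the patterns.

```python
import re

# The asker's original pattern and the two grouped rewrites from this answer.
original = re.compile(
    r"^[jJ]{2}[\-\s]{0,1}[0-9]{2}[\-\s]{0,1}[0-9]{2}[\-\s]{0,1}[0-9]{2}[\-\s]{0,1}[a-zA-Z]$"
)
grouped = re.compile(r"^[jJ]{2}(?:[-\s]?[0-9]{2}){3}[-\s]?[a-zA-Z]$")
# Same idea with a case-insensitive flag instead of spelling out both cases.
grouped_i = re.compile(r"^J{2}(?:[-\s]?[0-9]{2}){3}[-\s]?[A-Z]$", re.IGNORECASE)

# Hypothetical sample inputs, chosen only for illustration.
samples = ["JJ123456C", "jj 12 34 56 c", "Jj-12-34-56-C", "JJ12345C", "AB123456C"]

results = {s: bool(original.search(s)) for s in samples}
for s in samples:
    # All three patterns should agree on every sample.
    assert bool(grouped.search(s)) == results[s]
    assert bool(grouped_i.search(s)) == results[s]
    print(s, results[s])
```

Agreement on a handful of samples is not a proof of equivalence, but it is a cheap sanity check before swapping patterns.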
Another possible way is to remove everything that is not a letter or a digit beforehand (and possibly apply an uppercase function). Then you only need:
^J{2}[0-9]{6}[A-Z]$
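A rough sketch of that normalise-then-match approach, assuming Python; the helper names normalise and is_valid are made up for this example and do not come from any library.

```python
import re

# Hypothetical helper: drop everything that is not an ASCII letter or digit,
# then uppercase what is left.
def normalise(raw: str) -> str:
    return re.sub(r"[^A-Za-z0-9]", "", raw).upper()

# The simplified pattern from above, applied to the cleaned-up string.
SIMPLE = re.compile(r"^J{2}[0-9]{6}[A-Z]$")

def is_valid(raw: str) -> bool:
    return SIMPLE.match(normalise(raw)) is not None

print(normalise("jj 12-34 56 c"))
print(is_valid("jj 12-34 56 c"))
print(is_valid("jj 12-34 5 c"))
```

Be aware that stripping first is more permissive than the original pattern: separators in odd places (or other punctuation) are silently accepted, which may or may not be what you want.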
As an aside, I don't understand why you start your pattern with J for the first two letters, since many other letters are allowed according to this article: https://en.wikipedia.org/wiki/National_Insurance_number
Another thing: short and efficient are two different things in computing.
For example, this pattern is efficient too, and more restrictive:
^(?!N[KT]|BG|GB|[KT]N|ZZ)[ABCEGHJ-PRSTW-Z][ABCEGHJ-NPRSTW-Z][0-9][0-9][-\s]?[0-9][0-9][-\s]?[0-9][0-9][-\s]?[A-D]$
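Here is how the more restrictive pattern behaves on a few inputs, again as a sketch assuming Python's re module. The sample values are invented to hit each rule and are not real numbers. Note that the pattern is uppercase only and allows no separator between the two letters and the first digit pair.

```python
import re

# The restrictive pattern from above, unchanged.
NINO = re.compile(
    r"^(?!N[KT]|BG|GB|[KT]N|ZZ)[ABCEGHJ-PRSTW-Z][ABCEGHJ-NPRSTW-Z]"
    r"[0-9][0-9][-\s]?[0-9][0-9][-\s]?[0-9][0-9][-\s]?[A-D]$"
)

# Hypothetical inputs; the comment says which rule each one exercises.
samples = [
    "AB123456C",     # plain form
    "AB12 34 56 C",  # separators between digit pairs and before the suffix
    "GB123456A",     # prefix excluded by the negative lookahead
    "DA123456A",     # D is not in the first-letter class
    "AO123456A",     # O is not in the second-letter class
    "AB123456E",     # suffix must be A to D
]

checked = {s: NINO.match(s) is not None for s in samples}
for s, ok in checked.items():
    print(s, ok)
```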
Upvotes: 3
Reputation: 626738
A shorter version:
/^j{2}(?:[-\s]?\d{2}){3}[-\s]?[a-zA-Z]$/i
See the regex online demo
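If you want to convince yourself that each simplification used in this shorter version leaves the matching behaviour alone, a small sketch like the following can help. It assumes Python's re module, where the /i flag is written as the inline modifier (?i) or as re.IGNORECASE; the probe strings are arbitrary.

```python
import re

# Each entry is (longer form, shorter form); both should behave the same.
# Note: in Python, \d also matches non-ASCII digits unless re.ASCII is used.
pairs = [
    (r"[\-\s]", r"[-\s]"),      # a hyphen first in the class needs no escape
    (r"[0-9]{2}", r"\d{2}"),    # \d as shorthand for a digit
    (r"x{0,1}", r"x?"),         # {0,1} written as ?
    (r"[jJ]{2}", r"(?i)j{2}"),  # inline case-insensitive modifier
]
probes = ["-", " ", "42", "x", "", "jJ", "JJ", "ab", "4a"]

for long_form, short_form in pairs:
    for p in probes:
        # Both forms must agree on every probe string.
        assert bool(re.fullmatch(long_form, p)) == bool(re.fullmatch(short_form, p))

# The full shortened pattern, with /i expressed as an inline modifier.
short = re.compile(r"(?i)^j{2}(?:[-\s]?\d{2}){3}[-\s]?[a-z]$")
print(bool(short.match("jj 12-34 56 C")))
```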
Note that
- a hyphen (-) does not need to be escaped inside the character class if it is at the beginning or end of the class (see Metacharacters Inside Character Classes)
- \d can be used as a shorthand character class for a digit (see Shorthand Character Classes)
- the {0,1} limiting quantifier can usually be represented as a ? quantifier (1 or zero occurrences) (see Limiting Repetition)
- /i (or the inline modifier version (?i), depending on the engine) can be used to turn [jJ] into just j or J (see Specifying Modes Inside The Regular Expression)
- (?:[-\s]?\d{2}){3} repeats the grouped subpattern three times (see Limiting Repetition)
Upvotes: 2