Ishan

Reputation: 4028

Regular Expression to validate UK National Insurance Number

I have the following regular expression, which validates a British National Insurance Number:

^([a-zA-Z]){2}( )?([0-9]){2}( )?([0-9]){2}( )?([0-9]){2}( )?([a-zA-Z]){1}?$

Accepted values are:

AB 12 34 56 A

AB123456A

I want these values to be accepted as well. Can anyone please help me sort this out?

AB 123456 A

AB 123 456 A

AB 1 2345 6 A

    AB   12   34   56 A    (multiple space anywhere)

The RE should work even if there are extra or no spaces in the string. Is this possible to do in RE? Thank you in advance.

Upvotes: 25

Views: 38558

Answers (9)

alan

Reputation: 4842

Edit: Andrew Bauer modified my answer to add checks for allowed/disallowed characters that were unknown at the time I answered. You should up-vote his answer since it is more complete and apparently performs better validation.


If you can't just remove all the whitespace first, this should work:

^\s*[a-zA-Z]{2}(?:\s*\d\s*){6}[a-zA-Z]?\s*$

Explanation:

^                 # beginning of string
\s*               # optional leading whitespace
[a-zA-Z]{2}       # match two letters
(?:\s*\d\s*){6}   # six digits, with optional whitespace leading/trailing
[a-zA-Z]?         # zero or one letter
\s*               # optional trailing whitespace (just in case)
$                 # end of string
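As a quick check, the pattern above can be run against the sample inputs from the question. This is only a minimal JavaScript sketch; the names `looseNino`, `samples` and `results` are made up for illustration and are not part of the original answer.

```javascript
// Pattern from the answer above: whitespace is tolerated around every digit.
const looseNino = /^\s*[a-zA-Z]{2}(?:\s*\d\s*){6}[a-zA-Z]?\s*$/;

// Sample inputs taken from the question.
const samples = [
  "AB 12 34 56 A",
  "AB123456A",
  "AB 123456 A",
  "AB 123 456 A",
  "AB 1 2345 6 A",
  "    AB   12   34   56 A    ",
];

// One boolean per sample, in the same order as the list above.
const results = samples.map((s) => looseNino.test(s));
console.log(results);
```

Note that the pattern does not allow whitespace between the two prefix letters, so an input such as "A B 123456 A" would still be rejected.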

Upvotes: 38

EnviableOne

Reputation: 43

I had a go at cleaning it up a bit, and you can get full validation with either of these:

/^(?!BG|GB|NK|KN|TN|NT|ZZ)[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z](?:\s?\d){6}\s?[A-D]$/i

/^(?!BG|GB|NK|KN|TN|NT|ZZ)(?![DFIQUV])[A-Z](?![DFIOQUV])[A-Z](?:\s?\d){6}\s?[A-D]$/i

Broken down:

/                            //start of match
^                            //from start of string
(?!BG|GB|NK|KN|TN|NT|ZZ)     //exclude any starting with any of these combinations
[A-CEGHJ-PR-TW-Z]            //first letter not D, F, I, Q, U or V
or (?![DFIQUV])[A-Z]
[A-CEGHJ-NPR-TW-Z]           //second letter not D, F, I, O, Q, U or V
or (?![DFIOQUV])[A-Z]
(?:\s?\d){6}                 //six digits (0-9), each optionally preceded by a space (non-capturing)
\s?[A-D]                     //last letter A, B, C or D, optionally preceded by a space
$                            //to end of string
/                            //end of match
i                            //option i: case insensitive

According to Regex101, the two patterns take 34 and 38 steps respectively with PCRE2; the second is slower but easier to read.

Both cater for spaces between the groups of letters and the numbers.

Both allow upper or lower case.

Upvotes: 1

amittn

Reputation: 2355

  • I came across this while looking for a proper regex, so I thought I would share the DWP NINO format validation.
  • Under the hood, it uses the following regex: "(^(?!BG)(?!GB)(?!NK)(?!KN)(?!TN)(?!NT)(?!ZZ)[A-Z&&[^DFIQUV]][A-Z&&[^DFIOQUV]][0-9]{6}[A-D ]$)"
  • I found it very useful and it works as expected. The older version on Maven also supports a NINO whose ending character is a space.

The NINO validator enforces all of the following criteria.

  • Must be 9 characters.
  • First 2 characters must be alpha.
  • Next 6 characters must be numeric.
  • Final character can be A, B, C, D or space.
  • First character must not be D,F,I,Q,U or V.
  • Second character must not be D, F, I, O, Q, U or V.
  • First 2 characters must not be combinations of GB, NK, TN or ZZ (the term combinations covers both GB and BG etc.)
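The `[A-Z&&[^DFIQUV]]` form in the regex above is Java's character-class intersection, which a plain JavaScript regex does not support. As a rough sketch, the criteria listed above can instead be checked one rule at a time. The function name `ninoRuleFailures` and the failure messages are hypothetical, and the sketch assumes upper-case input with no spaces other than an optional final one, as the DWP pattern does.

```javascript
// Hypothetical helper: returns the list of rules a candidate NINO breaks.
// An empty array means every rule in the list above is satisfied.
function ninoRuleFailures(nino) {
  const failures = [];
  if (nino.length !== 9) failures.push("must be 9 characters");
  if (!/^[A-Z]{2}/.test(nino)) failures.push("first 2 characters must be alpha");
  if (!/^.{2}[0-9]{6}/.test(nino)) failures.push("next 6 characters must be numeric");
  if (!/^.{8}[A-D ]$/.test(nino)) failures.push("final character must be A, B, C, D or space");
  if (/^[DFIQUV]/.test(nino)) failures.push("first character must not be D, F, I, Q, U or V");
  if (/^.[DFIOQUV]/.test(nino)) failures.push("second character must not be D, F, I, O, Q, U or V");
  if (/^(BG|GB|NK|KN|TN|NT|ZZ)/.test(nino)) failures.push("prefix must not be BG, GB, NK, KN, TN, NT or ZZ");
  return failures;
}

console.log(ninoRuleFailures("AB123456C"));
console.log(ninoRuleFailures("GB123456A"));
```

Returning the broken rules rather than a single boolean makes it easier to show the user which part of the number is wrong.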

Upvotes: 0

Amey P Naik

Reputation: 718

There is a fixed format for National Insurance Number or NINO.

The format of the number is two prefix letters, six digits and one suffix letter.

Neither of the first two letters can be D, F, I, Q, U or V. The second letter also cannot be O. The prefixes BG, GB, NK, KN, TN, NT and ZZ are not allocated.

After the two prefix letters, the six digits are issued sequentially from 00 00 00 to 99 99 99. The last two digits determine the day of the week on which various social security benefits are payable and when unemployed claimants need to attend their Jobcentre to sign on.

The suffix letter is either A, B, C, or D.

The example used is typically AB123456C. Often, the number is printed with spaces to pair off the digits, like this: AB 12 34 56 C.

So, the regex would be,

[A-CEGHJ-PR-TW-Z]{1}[A-CEGHJ-NPR-TW-Z]{1}\s?[0-9]{2}\s?[0-9]{2}\s?[0-9]{2}\s?[A-DFMP ]

Upvotes: 1

Edev

Reputation: 11

[RegularExpression(@"^([ACEHJLMOPRSWXYacehjlmoprswxy][A-CEGHJ-NPR-TW-Za-ceghj-npr-tw-z]|[Bb][A-CEHJ-NPR-TW-Za-cehj-npr-tw-z]|[Gg][ACEGHJ-NPR-TW-Zaceghj-npr-tw-z]|[KTkt][A-CEGHJ-MPR-TW-Za-ceghj-mpr-tw-z]|[Nn][A-CEGHJL-NPR-SW-Za-ceghjl-npr-sw-z]|[Zz][A-CEGHJ-NPR-TW-Ya-ceghj-npr-tw-y])[0-9]{6}[A-Da-d ]$", ErrorMessage = "NI Number must be in the correct format.")]
  • N.B. This allows users to enter lower-case values. You may have a style on your textbox that transforms lower case to upper case, in which case you may want your NI Number regex to allow lower case as well.

Upvotes: 0

rg246

Reputation: 56

I found a link to the government xml document which contains the regular expression for validating national insurance which was:

[A-CEGHJ-NOPR-TW-Z]{2}[0-9]{6}[ABCD\s]{1}

I've done some testing with an online regex tester, and it seems to work well and in only 4 steps.

https://web.archive.org/web/20121026141031/http://webarchive.nationalarchives.gov.uk/+/http://www.cabinetoffice.gov.uk/media/291296/CitizenIdentificationTypes-v1-4.xml

Upvotes: 2

Phil Cook

Reputation: 2065

After reading all the answers here I have determined that there isn't a clear answer to this question.

With my regex you will need to strip all spaces out of the string first, but really you should be doing this anyway when validating most data. This is easy to do; here are a couple of examples.

PHP

preg_replace('/(\s+)|(-)/', '', $str)

Javascript

str.replace(/ /g,'')

For the validation based off the UK government advice (http://www.hmrc.gov.uk/manuals/nimmanual/nim39110.htm) I constructed the following regex.

/^[A-CEGHJ-PR-TW-Z]{1}[A-CEGHJ-NPR-TW-Z]{1}[0-9]{6}[A-D]{1}$/i

Here is an explanation of this regex

/                      # Wraps regex
^                      # Beginning of string
[A-CEGHJ-PR-TW-Z]{1}   # Match first letter, cannot be D, F, I, Q, U or V
[A-CEGHJ-NPR-TW-Z]{1}  # Match second letter, cannot be D, F, I, O, Q, U or V
[0-9]{6}               # Six digits
[A-D]{1}               # Match last letter can only be A, B, C or D
$                      # End of string
/i                     # Ending wrapper and i denotes can be upper or lower case

Here are some test patterns you can use

Pass

AA 11 22 33 A
BB 44 55 66 B
ZZ 67 89 00 C

Fail

AA 11 22 33 E
DA 11 22 33 A
FA 11 22 33 A
AO 11 22 33 A

As I needed to extend jQuery validate to add this new national insurance number regex I am also including this as it may be useful for someone.

jQuery.validator.addMethod('nino', function(nino, element) {
    return this.optional(element) || nino.length >= 9 &&
        nino.replace(/ /g,'').match(/^[A-CEGHJ-PR-TW-Z]{1}[A-CEGHJ-NPR-TW-Z]{1}[0-9]{6}[A-D]{1}$/i);
}, 'Please specify a valid national insurance number');

Upvotes: 7

Andrew Bauer

Reputation: 721

Actually, NIN doesn't allow D, F, I, Q, U or V for either of the first two letters and doesn't allow O for the second letter; on top of this, the prefix cannot be BG, GB, NK, KN, TN, NT or ZZ. Also, the suffix letter is either A, B, C or D, but may be represented by a space if unknown. - http://en.wikipedia.org/wiki/National_Insurance_number#Format

As such, a more valid check would be the following (I have only supplied a version with capital letters; it can easily be altered for lower case):

^(?!BG)(?!GB)(?!NK)(?!KN)(?!TN)(?!NT)(?!ZZ)(?:[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z])(?:\s*\d\s*){6}([A-D]|\s)$
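One way to get the lower-case variant mentioned above, at least in JavaScript, is to keep the pattern unchanged and add the `i` flag, which also makes the negative lookaheads case-insensitive. This is a sketch under that assumption; `ninoInsensitive` and `acceptsNino` are hypothetical names.

```javascript
// Same pattern as above, with the i flag added for case-insensitive matching.
const ninoInsensitive = /^(?!BG)(?!GB)(?!NK)(?!KN)(?!TN)(?!NT)(?!ZZ)(?:[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z])(?:\s*\d\s*){6}([A-D]|\s)$/i;

// Hypothetical wrapper so call sites read as a yes/no question.
function acceptsNino(value) {
  return ninoInsensitive.test(value);
}

console.log(acceptsNino("AB 12 34 56 C"));
console.log(acceptsNino("ab123456c"));
console.log(acceptsNino("GB123456A"));
```

Because the suffix alternative is `([A-D]|\s)`, an input that ends in a single space in place of the suffix letter, such as "AB123456 ", is accepted as well.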

Upvotes: 72

Artem Koshelev

Reputation: 10607

I think it is better to normalize it first (remove all white-space characters: \s regex to replace with empty string), then validate.

^[a-zA-Z]{2}[0-9]{6}[a-zA-Z]{1}$
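A minimal JavaScript sketch of the normalize-then-validate approach described above; the function name `hasNinoShape` is made up for illustration. It only checks the overall shape (two letters, six digits, one letter) and applies none of the letter restrictions discussed in the other answers.

```javascript
// Hypothetical helper: strip every whitespace character, then check the shape.
function hasNinoShape(input) {
  const normalized = input.replace(/\s+/g, "");
  return /^[a-zA-Z]{2}[0-9]{6}[a-zA-Z]{1}$/.test(normalized);
}

console.log(hasNinoShape("  AB 1 2345 6 A  "));
console.log(hasNinoShape("AB12345A"));
```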

Upvotes: 5
