J.Hendrix

Reputation: 2199

How can I validate US Social Security Number?

Does anyone know how to improve this function? I'm not worried about shortening the code (I'm sure this could be done with a better regex); I'm more concerned about correct logic. I have had a terrible time finding documentation for SSNs. Most of the rules I use below have come from other programmers who work in the credit industry (no sources cited).

  1. Are there any additional rules that you are aware of?

  2. Do you know if any of this is wrong?

  3. Can you cite your sources?

    public static bool isSSN(string ssn)
    {
        Regex rxBadSSN = new Regex(@"(\d)\1\1\1\1\1\1\1\1");
    
        //Must be 9 bytes
        if(ssn.Trim().Length != 9)
            return false;
    
        //Must be numeric
        if(!isNumeric(ssn))
            return false;
    
        //Must be less than 772999999
        if( (Int32)Double.Parse(ssn.Substring(0,3)) > 772 )
        {
            //Check for Green Card Temp SSN holders
            // Could be 900700000
            //          900800000
            if(ssn.Substring(0,1) != "9")
                return false;
    
            if(ssn.Substring(3,1) != "7" && ssn.Substring(3,1) != "8")
                return false;
        }
    
        //Obviously Fake!
        if(ssn == "123456789")
            return false;
    
        //Try again!
        if(ssn == "123121234")
            return false;
    
        //No single group can have all zeros
        if(ssn.Substring(0,3) == "000")
            return false;
        if(ssn.Substring(3,2) == "00")
            return false;
        if(ssn.Substring(5,4) == "0000")
            return false;
    
        //Check to make sure the SSN number is not repeating
        if (rxBadSSN.IsMatch(ssn))
            return false;
    
        return true;
    }
    

Upvotes: 38

Views: 72692

Answers (8)

GuyThreepwood64

Reputation: 21

I know this is an old question, but for the sake of others looking for answers, I figured I'd add a quick JavaScript function for checking that a given SSN is valid.

function checkSSN(inputSSN) {
    // inputSSN is expected in the dashed form, e.g. "123-45-6789"
    var ssnRegex = new RegExp("^(9[0-9][0-9]|666|000|078051120|219099999|123456789|123121234|321214321)|^([0-8][0-9][0-9]00)|^([0-8][0-9][0-9][0-9][0-9]0000)$"),
        repeats = /^(.)\1+$/,
        dashes = inputSSN.match(/-/g); // null when there are no dashes at all

    //make sure we have 2 dashes in the input Social Security number
    if (dashes !== null && dashes.length === 2) {
        //Once we have confirmed that there are the right number of dashes, remove them, and make sure that the resulting string is a number (you may or may not need this logic depending on the format of your input SSN).
        inputSSN = inputSSN.replace(/-/g, "");

        if (!isNaN(inputSSN)) {
            //Test the input SSN against our regex to ensure that it doesn't contain any disqualifying combinations.
            if (!ssnRegex.test(inputSSN)) {
                //Make sure the input SSN isn't just a repeated number
                if (!repeats.test(inputSSN)) {
                    //If it lands inside of this, we know it's a valid option for a social security number.
                    return true;
                }
            }
        }
    }
    return false;
}

For the ssnRegex logic:

The first section matches an SSN that starts with 900-999, 666, or 000, or that is one of the known disqualifying SSNs listed in the regex.

^(9[0-9][0-9]|666|000|078051120|219099999|123456789|123121234|321214321)

The second section ensures that the two-digit portion isn't 00

^([0-8][0-9][0-9]00)

The third section ensures that the last portion isn't 0000

^([0-8][0-9][0-9][0-9][0-9]0000)

We additionally check that the input is a number, and that it isn't just a single repeated digit.
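To see what each alternative contributes, the three sections can be tested separately. This is only a rough sketch, assuming the dashes have already been removed; the names `badPrefixOrKnownFake`, `zeroGroup`, `zeroSerial` and `explainRejection` are made up for illustration and are not part of the function above.

```javascript
// Hypothetical helper: reports which section of ssnRegex rejects a 9-digit string.
const badPrefixOrKnownFake = /^(9[0-9][0-9]|666|000|078051120|219099999|123456789|123121234|321214321)/;
const zeroGroup = /^([0-8][0-9][0-9]00)/;
const zeroSerial = /^([0-8][0-9][0-9][0-9][0-9]0000)$/;

function explainRejection(digits) {
    if (badPrefixOrKnownFake.test(digits)) return "bad area number or known fake";
    if (zeroGroup.test(digits)) return "group number is 00";
    if (zeroSerial.test(digits)) return "serial number is 0000";
    return null; // none of the three sections matched
}
```

A `null` result only means the regex found nothing disqualifying; the dash count and repeated-digit checks still apply separately.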

Upvotes: 1

Tank Liu

Reputation: 1

In Hive, SSN or ITIN validation can look like this:

select case when '078051120' rlike '^(?!1{9}|2{9}|3{9}|4{9}|5{9}|6{9}|7{9}|8{9}|9{9}|219099999|078051120|123456789)(?!666|000|9[0-9]{2})[0-9]{3}(?!00)[0-9]{2}(?!0{4})[0-9]{4}$' then 'SSN' 
     when '078051120' rlike '^9[0-9]{2}(7[0-9]|80|81|82|83|84|85|86|87|88)[0-9]{4}$' then 'ITIN' 
     else 'INVLD'
end as ssn_flg;

Change the dummy '078051120' to the name of the column in your table.
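For anyone who wants the same classification outside Hive, the two patterns carry over to JavaScript unchanged, since both engines support the lookaheads used here. This is a sketch rather than a tested port of the query; `SSN_PATTERN`, `ITIN_PATTERN` and `classifyTaxId` are names I made up.

```javascript
// Same patterns as the Hive query, written as JavaScript regex literals.
const SSN_PATTERN = /^(?!1{9}|2{9}|3{9}|4{9}|5{9}|6{9}|7{9}|8{9}|9{9}|219099999|078051120|123456789)(?!666|000|9[0-9]{2})[0-9]{3}(?!00)[0-9]{2}(?!0{4})[0-9]{4}$/;
const ITIN_PATTERN = /^9[0-9]{2}(7[0-9]|80|81|82|83|84|85|86|87|88)[0-9]{4}$/;

// Hypothetical helper mirroring the CASE expression: 'SSN', 'ITIN' or 'INVLD'.
function classifyTaxId(value) {
    if (SSN_PATTERN.test(value)) return "SSN";
    if (ITIN_PATTERN.test(value)) return "ITIN";
    return "INVLD";
}
```

With the dummy value from the query, `classifyTaxId("078051120")` gives "INVLD", because that number is on the exclusion list and does not start with 9.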

Upvotes: 0

Eric J.

Reputation: 150148

UPDATE

On June 25, 2011, the SSA changed the SSN assignment process to "SSN randomization". SSN randomization affects the SSN assignment process in the following ways:

  • It eliminates the geographical significance of the first three digits of the SSN, previously referred to as the Area Number, by no longer allocating the Area Numbers for assignment to individuals in specific states.
  • It eliminates the significance of the highest Group Number and, as a result, the High Group List is frozen in time and can be used for validation of SSNs issued prior to the randomization implementation date.
  • Previously unassigned Area Numbers have been introduced for assignment excluding Area Numbers 000, 666 and 900–999.

New Rules

  • The Social Security number is a nine-digit number in the format "AAA-GG-SSSS". The number is divided into three parts: the Area Number (the first three digits), the Group Number, and the Serial Number.
  • The middle two digits are the Group Number. The Group Numbers range from 01 to 99.
  • The last four digits are Serial Numbers. They represent a straight numerical sequence of digits from 0001 to 9999 within the group.
  • Some special numbers are never allocated:
    • Numbers with all zeros in any digit group (000-##-####, ###-00-####, ###-##-0000).
    • Numbers with 666 or 900-999 (Individual Taxpayer Identification Number) in the first digit group.
  • SSNs that have been used in advertising are treated as invalid.

http://en.wikipedia.org/wiki/Social_Security_number#Structure
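The rules above could be sketched in JavaScript roughly like this. It assumes input in the dashed AAA-GG-SSSS format; `parseSSN` and the shape of its return value are hypothetical, and the numbers invalidated by advertising are not covered, since those need an explicit list.

```javascript
// Sketch of the post-randomization rules: format, zero groups, 666 and 900-999.
function parseSSN(text) {
    const match = /^(\d{3})-(\d{2})-(\d{4})$/.exec(text);
    if (match === null) {
        return { valid: false, reason: "not in AAA-GG-SSSS format" };
    }

    const area = Number(match[1]);   // first three digits
    const group = Number(match[2]);  // middle two digits, 01 to 99
    const serial = Number(match[3]); // last four digits, 0001 to 9999

    if (area === 0 || group === 0 || serial === 0) {
        return { valid: false, reason: "a digit group is all zeros" };
    }
    if (area === 666 || area >= 900) {
        return { valid: false, reason: "area number 666 or 900-999 is never allocated" };
    }
    return { valid: true, area: area, group: group, serial: serial };
}
```

Returning a reason alongside the flag makes it easier to tell a formatting problem from a number that can never have been issued.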

Previous Answer

Here's the most-complete description of the makeup of an SSN that I have found.

Upvotes: 33

MaKR

Reputation: 1892

This answer comes 5 years after the initial question, because the Social Security Administration has changed its validation rules. There are also specific numbers to invalidate according to this link.

As in my answer from nearly two years ago, I left out isNumeric(ssn) because the field is numeric and non-digit characters are already stripped before the validate function is called.

// validate social security number with ssn parameter as string
function validateSSN(ssn) {
  // find area number (1st 3 digits, no longer actually signifies area)
  // radix 10 so a leading zero is never read as octal
  var area = parseInt(ssn.substring(0, 3), 10);
  return (
    // 9 characters
    ssn.length === 9 &&
    // basic regex: nine digits, the first of which is 0-8
    /^[0-8][0-9]{2}[0-9]{2}[0-9]{4}$/.test(ssn) &&
    // disallow Satan's minions from becoming residents of the US
    area !== 666 &&
    // it's not triple nil
    area !== 0 &&
    // fun fact: some idiot boss put his secretary's ssn in wallets
    // he sold, now it "belongs" to 40000 people
    ssn !== '078051120' &&
    // was used in an ad by the Social Security Administration
    ssn !== '219099999'
  );
}

According to updated information there are no other checks to perform.

Upvotes: 15

Louis

Reputation: 181

As of 2011, SSNs are completely randomized (http://www.socialsecurity.gov/employer/randomization.html)

The only real rules left are:

  • Cannot start with 900-999 (although the Individual Taxpayer Identification Number, which can be used like an SSN by temporary residents and undocumented/DACA/DAPA immigrants in some situations, is in the same format and does start with 9)
  • Cannot start with 666
  • Cannot start with 000
  • Must be 9 numeric digits, or 11 characters with the 2 dashes
  • Cannot be any of the known fakes:
    • "078051120" — Woolworth Wallet Fiasco
    • "219099999" — Was used in an ad by the Social Security Administration
  • Many people exclude repeating and sequential numbers as well, although these are now technically valid, and I feel sorry for the poor schmucks who get assigned these.
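Put together, the list above might look something like this in JavaScript. Treat it as a sketch: `KNOWN_FAKES`, `normalizeSSN`, `isPlausibleSSN` and the `rejectRepeats` option are names I made up, and only the repeated-digit half of the optional last rule is shown.

```javascript
const KNOWN_FAKES = new Set(["078051120", "219099999"]);

// Returns the 9-digit form, or null if the input is neither 9 digits
// nor 11 characters with dashes in the AAA-GG-SSSS positions.
function normalizeSSN(input) {
    if (/^\d{9}$/.test(input)) return input;
    if (/^\d{3}-\d{2}-\d{4}$/.test(input)) return input.replace(/-/g, "");
    return null;
}

// rejectRepeats is a hypothetical switch for the stricter, unofficial check
// that throws out numbers made of a single repeated digit.
function isPlausibleSSN(input, rejectRepeats = false) {
    const digits = normalizeSSN(input);
    if (digits === null) return false;

    const area = digits.slice(0, 3);
    if (area === "000" || area === "666" || area[0] === "9") return false;
    if (KNOWN_FAKES.has(digits)) return false;
    if (rejectRepeats && /^(\d)\1{8}$/.test(digits)) return false;
    return true;
}
```

Keeping the repeated-digit check behind a flag reflects the point that such numbers are technically valid, so rejecting them is a policy choice rather than a rule.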

Upvotes: 18

Jesse Szypulski

Reputation: 118

Here is my PHP version

/**
 * Validate SSN - must be in format AAA-GG-SSSS or AAAGGSSSS
 *
 * @param $ssn
 * @return bool
 */
function validate_ssn($ssn) {

    $ssnTrimmed = trim($ssn);

    // Must be in format AAA-GG-SSSS or AAAGGSSSS
    if ( ! preg_match("/^([0-9]{9}|[0-9]{3}-[0-9]{2}-[0-9]{4})$/", $ssnTrimmed)) {
        return false;
    }

    // Split groups into an array
    $ssnFormatted = (strlen($ssnTrimmed) == 9) ? preg_replace("/^([0-9]{3})([0-9]{2})([0-9]{4})$/", "$1-$2-$3", $ssnTrimmed) : $ssnTrimmed;
    $ssn_array = explode('-', $ssnFormatted);

    // number groups must follow these rules:
    // * no single group can have all 0's
    // * first group cannot be 666, 900-999
    // * second group must be 01-99
    // * third group must be 0001-9999

    foreach ($ssn_array as $group) {
        if ($group == 0) {
            return false;
        }
    }

    if ($ssn_array[0] == 666 || $ssn_array[0] > 899) {
        return false;
    }

    return true;
}

Upvotes: 1

MaKR

Reputation: 1892

This is obviously an old post, but I found some ways to shorten it. Also there are a few specific numbers to invalidate according to this link: http://www.snopes.com/business/taxes/woolworth.asp

Here's how I did it. I could have used regexes for repeating numbers, but with specific ones to invalidate we might as well add ones through fives to that list (over 5 will invalidate anyways due to area number validation). I also left out isNumeric(ssn) because the field is a numeric and already strips characters before calling the validate function.

function validateSSN(ssn) {
    // validate format: exactly 9 digits, and no group made up entirely of
    // zeroes (a leading zero such as 078 or 01 is still allowed)
    if (!/^(?!000)[0-9]{3}(?!00)[0-9]{2}(?!0000)[0-9]{4}$/.test(ssn))
        return false;

    // validate area number (1st 3 digits)
    var area=parseInt(ssn.substring(0, 3), 10);
    //  standard      railroad numbers (pre-1963)
    if (area>649 && !(area>=700 && area<=728)) return false;

    // disallow specific invalid number
    if (ssn=='078051120' || // fun fact: some idiot boss put his
                            // secretary's ssn in wallets he sold,
                            // now this is 40000 people's ssn
        ssn=='219099999' || // was used in an ad by the Social Security
                            // Administration
        ssn=='123456789' || // although valid it's not yet assigned and
                            // you're not likely to meet the person who
                            // will get it
        ssn=='123121234' || // probably is assigned to someone but more
                            // likely to find someone trying to fake a
                            // number (next is same)
        ssn=='321214321' || // all the rest are likely potentially
                            // valid, but most likely these numbers are
                            // abused
        ssn=='111111111' ||
        ssn=='222222222' ||
        ssn=='333333333' ||
        ssn=='444444444' ||
        ssn=='555555555') return false;

    return true;
}

Upvotes: 1

2kmaro

Reputation: 9

Since the randomization of Social Security numbers took effect on June 25, 2011, the first three digits no longer map to a geographic area. The SSA's randomization notice still excludes area numbers 000, 666, and 900-999 from assignment, so those remain invalid.

The only certain things at this point in time appear to be:
the first group of 3 will never be 000, 666, or 900-999
the middle group pair will never be 00
and the last four will never be 0000

You can start by checking that the numeric value of the entry is >= 1010001 [and < 1000000000] (an SSN of 001-01-0001 appears to be the lowest legitimately assigned). Then you can check for 00 in positions 4 and 5, and 0000 in the last four.
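Here is roughly how that numeric approach could look in JavaScript. It is a sketch that assumes the entry arrives as a 9-character digit string; `passesNumericChecks` is a made-up name, and it covers only the range test and the two zero checks described in this paragraph.

```javascript
// Hypothetical helper illustrating the range test plus the 00 / 0000 checks.
function passesNumericChecks(entry) {
    if (!/^\d{9}$/.test(entry)) return false;

    const value = Number(entry);
    // 001-01-0001 reads as 1010001; anything lower has an all-zero area,
    // a zero group, or a zero serial. The upper bound mirrors the text and
    // cannot trip once the input is known to be 9 digits.
    if (value < 1010001 || value >= 1000000000) return false;

    const group = Math.floor(value / 10000) % 100; // positions 4 and 5
    const serial = value % 10000;                  // last four digits
    return group !== 0 && serial !== 0;
}
```

The lower bound alone already rules out a 000 area, because any value of at least 1010001 has a non-zero digit somewhere in the first three positions.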

Upvotes: 0
