Reputation: 2199
Anyone out there know how to improve this function? I'm not worried about shortening the code (I'm sure this could be done with a better regex); I am more concerned about correct logic. I have had a terrible time finding documentation for SSNs. Most of the rules I use below have come from other programmers who work in the credit industry (no sources cited).
Are there any additional rules that you are aware of?
Do you know if any of this is wrong?
Can you cite your sources?
public static bool isSSN(string ssn)
{
    Regex rxBadSSN = new Regex(@"(\d)\1\1\1\1\1\1\1\1");
    //Must be 9 bytes
    if(ssn.Trim().Length != 9)
        return false;
    //Must be numeric
    if(!isNumeric(ssn))
        return false;
    //Must be less than 772999999
    if( (Int32)Double.Parse(ssn.Substring(0,3)) > 772 )
    {
        //Check for Green Card Temp SSN holders
        // Could be 900700000
        //          900800000
        if(ssn.Substring(0,1) != "9")
            return false;
        if(ssn.Substring(3,1) != "7" && ssn.Substring(3,1) != "8")
            return false;
    }
    //Obviously Fake!
    if(ssn == "123456789")
        return false;
    //Try again!
    if(ssn == "123121234")
        return false;
    //No single group can have all zeros
    if(ssn.Substring(0,3) == "000")
        return false;
    if(ssn.Substring(3,2) == "00")
        return false;
    if(ssn.Substring(5,4) == "0000")
        return false;
    //Check to make sure the SSN number is not repeating
    if (rxBadSSN.IsMatch(ssn))
        return false;
    return true;
}
Upvotes: 38
Views: 72692
Reputation: 21
I know this is an old question, but for the sake of others looking for answers, I figured I'd add a quick JavaScript function for checking that a given SSN is valid.
function checkSSN(inputSSN) {
    //inputSSN is the string from your input, e.g. "123-45-6789"
    var ssnRegex = new RegExp("^(9[0-9][0-9]|666|000|078051120|219099999|123456789|123121234|321214321)|^([0-8][0-9][0-9]00)|^([0-8][0-9][0-9][0-9][0-9]0000)$"),
        repeats = /^(.)\1+$/,
        dashes = inputSSN.match(/-/g);
    //make sure we have 2 dashes in the input Social Security number (match() returns null when there are none)
    if (dashes !== null && dashes.length === 2) {
        //Once we have confirmed that there are the right number of dashes, remove them, and make sure that the resulting string is a 9-digit number (you may or may not need this logic depending on the format of your input SSN).
        inputSSN = inputSSN.replace(/-/g, "");
        if (/^[0-9]{9}$/.test(inputSSN)) {
            //Test the input SSN against our regex to ensure that it doesn't contain any disqualifying combinations.
            if (!ssnRegex.test(inputSSN)) {
                //Make sure the input SSN isn't just a repeated number
                if (!repeats.test(inputSSN)) {
                    //If it lands inside of this, we know it's a valid option for a social security number.
                    return true;
                }
            }
        }
    }
    return false;
}
For the ssnRegex logic:
The first section handles if the SSN starts with a number 900-999, 666, 000, or one of the known disqualifying SSNs mentioned above.
^(9[0-9][0-9]|666|000|078051120|219099999|123456789|123121234|321214321)
The second section ensures that the 2-digit portion isn't 00
^([0-8][0-9][0-9]00)
The third section ensures that the last portion isn't 0000
^([0-8][0-9][0-9][0-9][0-9]0000)
We additionally check to make sure they have inputted a number, and that they aren't just using a repeated number.
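To make the three parts easier to see, here is a minimal sketch that splits the combined pattern into separately named regexes and reports which one rejected the input. The names blockedPrefix, zeroGroup, zeroSerial and disqualifyingReason are made up for illustration and are not part of the function above. The input is assumed to be a 9-digit string with the dashes already removed.

```javascript
// Hypothetical helper names; the patterns are the three sections described above.
var blockedPrefix = /^(9[0-9][0-9]|666|000|078051120|219099999|123456789|123121234|321214321)/;
var zeroGroup = /^[0-8][0-9][0-9]00/;
var zeroSerial = /^[0-8][0-9][0-9][0-9][0-9]0000$/;

// Returns a short label for the first section that matches, or null if none match.
function disqualifyingReason(digits) {
    if (blockedPrefix.test(digits)) return "area-or-known-number";
    if (zeroGroup.test(digits)) return "group";
    if (zeroSerial.test(digits)) return "serial";
    return null;
}
```

Splitting the pattern this way costs a few more lines, but it makes it easier to tell a user why a number was rejected.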
Upvotes: 1
Reputation: 1
In Hive, SSN or ITIN validation looks like this:
select case when '078051120' rlike '^(?!1{9}|2{9}|3{9}|4{9}|5{9}|6{9}|7{9}|8{9}|9{9}|219099999|078051120|123456789)(?!666|000|9[0-9]{2})[0-9]{3}(?!00)[0-9]{2}(?!0{4})[0-9]{4}$' then 'SSN'
when '078051120' rlike '^9[0-9]{2}(7[0-9]|80|81|82|83|84|85|86|87|88)[0-9]{4}$' then 'ITIN'
else 'INVLD'
end as ssn_flg;
Change the dummy '078051120' to the column name of your table.
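The same two patterns can be tried outside Hive. This is only a sketch in JavaScript, whose regex engine also supports the negative lookaheads used here. classifySsn is a hypothetical name, the patterns are copied from the query above (with 80 through 88 collapsed to 8[0-8]), and the ITIN range is whatever that query assumes, not something checked against current IRS rules.

```javascript
// Patterns copied from the Hive query above; classifySsn is a made-up name.
var ssnPattern = /^(?!1{9}|2{9}|3{9}|4{9}|5{9}|6{9}|7{9}|8{9}|9{9}|219099999|078051120|123456789)(?!666|000|9[0-9]{2})[0-9]{3}(?!00)[0-9]{2}(?!0{4})[0-9]{4}$/;
var itinPattern = /^9[0-9]{2}(7[0-9]|8[0-8])[0-9]{4}$/;

// Mirrors the CASE expression: SSN first, then ITIN, otherwise INVLD.
function classifySsn(value) {
    if (ssnPattern.test(value)) return "SSN";
    if (itinPattern.test(value)) return "ITIN";
    return "INVLD";
}
```

The order of the two tests matches the order of the WHEN branches, although the patterns cannot both match the same input because the SSN pattern rejects anything starting with 9.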
Upvotes: 0
Reputation: 150148
UPDATE
On June 25, 2011, the SSA changed the SSN assignment process to "SSN randomization". SSN randomization affects the SSN assignment process in the following ways:
It eliminates the geographical significance of the first three digits of the SSN, previously referred to as the Area Number, by no longer allocating the Area Numbers for assignment to individuals in specific states.
It eliminates the significance of the highest Group Number and, as a result, the High Group List is frozen in time and can be used for validation of SSNs issued prior to the randomization implementation date.
Previously unassigned Area Numbers have been introduced for assignment excluding Area Numbers 000, 666 and 900–999.
New Rules
http://en.wikipedia.org/wiki/Social_Security_number#Structure
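As a rough illustration of the area-number exclusion quoted above (000, 666 and 900–999), here is a small JavaScript sketch. isAssignableAreaNumber is a hypothetical name, and the function looks only at the first three digits, so it says nothing about the group or serial parts.

```javascript
// Hypothetical helper: checks only the 3-digit Area Number against the
// exclusions quoted above (000, 666 and 900-999).
function isAssignableAreaNumber(area) {
    if (!/^[0-9]{3}$/.test(area)) return false; // must be exactly 3 digits
    var n = parseInt(area, 10);
    return n !== 0 && n !== 666 && n < 900;
}
```

For a full 9-digit string you would pass ssn.substring(0, 3).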
Previous Answer
Here's the most-complete description of the makeup of an SSN that I have found.
Upvotes: 33
Reputation: 1892
This answer comes 5 years after the initial question because the Social Security Administration changed its validation rules. There are also specific numbers to invalidate according to this link.
As in my near-2-year-old answer, I left out isNumeric(ssn) because the field is numeric and already strips other characters before calling the validate function.
// validate social security number with ssn parameter as string
function validateSSN(ssn) {
    // find area number (1st 3 digits, no longer actually signifies area)
    var area = parseInt(ssn.substring(0, 3), 10);
    return (
        // 9 characters
        ssn.length === 9 &&
        // basic regex (test() returns a boolean; the lookaheads reject
        // a 00 group number and a 0000 serial number)
        /^[0-8][0-9]{2}(?!00)[0-9]{2}(?!0000)[0-9]{4}$/.test(ssn) &&
        // disallow Satan's minions from becoming residents of the US
        area !== 666 &&
        // it's not triple nil
        area !== 0 &&
        // fun fact: some idiot boss put his secretary's ssn in wallets
        // he sold, now it "belongs" to 40000 people
        ssn !== '078051120' &&
        // was used in an ad by the Social Security Administration
        ssn !== '219099999'
    );
}
According to updated information there are no other checks to perform.
Upvotes: 15
Reputation: 181
As of 2011, SSNs are completely randomized (http://www.socialsecurity.gov/employer/randomization.html)
The only real rules left are:
Upvotes: 18
Reputation: 118
Here is my PHP version
/**
 * Validate SSN - must be in format AAA-GG-SSSS or AAAGGSSSS
 *
 * @param $ssn
 * @return bool
 */
function validate_ssn($ssn) {
    $ssnTrimmed = trim($ssn);
    // Must be in format AAA-GG-SSSS or AAAGGSSSS
    if ( ! preg_match("/^([0-9]{9}|[0-9]{3}-[0-9]{2}-[0-9]{4})$/", $ssnTrimmed)) {
        return false;
    }
    // Split groups into an array
    $ssnFormatted = (strlen($ssnTrimmed) == 9) ? preg_replace("/^([0-9]{3})([0-9]{2})([0-9]{4})$/", "$1-$2-$3", $ssnTrimmed) : $ssnTrimmed;
    $ssn_array = explode('-', $ssnFormatted);
    // number groups must follow these rules:
    // * no single group can have all 0's
    // * first group cannot be 666, 900-999
    // * second group must be 01-99
    // * third group must be 0001-9999
    foreach ($ssn_array as $group) {
        if ($group == 0) {
            return false;
        }
    }
    if ($ssn_array[0] == 666 || $ssn_array[0] > 899) {
        return false;
    }
    return true;
}
Upvotes: 1
Reputation: 1892
This is obviously an old post, but I found some ways to shorten it. Also there are a few specific numbers to invalidate according to this link: http://www.snopes.com/business/taxes/woolworth.asp
Here's how I did it. I could have used regexes for repeating numbers, but with specific ones to invalidate we might as well add ones through fives to that list (over 5 will invalidate anyways due to area number validation). I also left out isNumeric(ssn) because the field is a numeric and already strips characters before calling the validate function.
function validateSSN(ssn) {
    // validate format (exactly 9 digits, no group of all zeroes)
    if (!ssn.match(/^(?!000)[0-9]{3}(?!00)[0-9]{2}(?!0000)[0-9]{4}$/)) return false;
    // validate area number (1st 3 digits)
    var area = parseInt(ssn.substring(0, 3), 10);
    // standard railroad numbers (pre-1963)
    if (area > 649 && !(area >= 700 && area <= 728)) return false;
    // disallow specific invalid number
    if (ssn == '078051120' || // fun fact: some idiot boss put his
                              // secretary's ssn in wallets he sold,
                              // now this is 40000 people's ssn
        ssn == '219099999' || // was used in an ad by the Social Security
                              // Administration
        ssn == '123456789' || // although valid it's not yet assigned and
                              // you're not likely to meet the person who
                              // will get it
        ssn == '123121234' || // probably is assigned to someone but more
                              // likely to find someone trying to fake a
                              // number (next is same)
        ssn == '321214321' || // all the rest are likely potentially
                              // valid, but most likely these numbers are
                              // abused
        ssn == '111111111' ||
        ssn == '222222222' ||
        ssn == '333333333' ||
        ssn == '444444444' ||
        ssn == '555555555') return false;
    return true;
}
Upvotes: 1
Reputation: 9
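Since Social Security numbers were randomized in 2011, most of the older area-number rules no longer apply. Note, though, that the SSA text quoted in another answer here still excludes area numbers 000, 666 and 900–999, so the 900 series and 666 are not valid.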
The only certain things at this point in time appear to be:
the first group of 3 will never be 000
the middle group pair will never be 00
and the last four will never be 0000
You can perform some testing by first ensuring that the numeric value of the entry is >= 1010001 and < 1000000000 (an SSN of 001-01-0001 appears to be the lowest legitimately assigned). Then you can proceed to check for 00 in positions 4 and 5, and 0000 in the last four.
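The numeric test described above could be sketched like this in JavaScript. passesBasicRangeChecks is a hypothetical name, and the input is assumed to be the 9 digits as a string with any dashes already removed (a string is used so that leading zeros survive).

```javascript
// Hypothetical helper implementing only the checks described above.
function passesBasicRangeChecks(ssn) {
    // Assumption: input is a string of exactly 9 digits.
    if (!/^[0-9]{9}$/.test(ssn)) return false;
    var value = parseInt(ssn, 10);
    // 001-01-0001 is 1010001 as a number; anything lower has a zero group somewhere.
    if (value < 1010001 || value >= 1000000000) return false;
    // Positions 4 and 5 (the middle pair) must not be 00.
    if (ssn.substring(3, 5) === "00") return false;
    // The last four must not be 0000.
    if (ssn.substring(5) === "0000") return false;
    return true;
}
```

The lower bound already rejects every number whose first group is 000, since the largest such value is 999999.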
Upvotes: 0