Reputation: 37470
In some cases, organizations are not permitted to use or store useful keys such as Social Security numbers (SSNs), phone numbers, etc.
However, these unique keys are very useful for matching data. So, theoretically, if a data provider were able to provide you with a hashed value of the SSN, and you were to store that hash and use it for matching, you would never have to use or store the SSN.
What would be an appropriate hash function for something like an SSN?
Upvotes: 4
Views: 891
Reputation: 1
True, but you can still use the hash to uniquely fingerprint something (here, the SSN) by relying on the second-preimage resistance of a cryptographic hash function. As said above, because the data is so small, hash it with a strong, slow hash algorithm and a unique per-record prefix and suffix salt.
Upvotes: 0
Reputation: 28316
You need to treat the SSN exactly like a password. Hash it using a strong, slow hash algorithm such as bcrypt or PBKDF2, with a unique per-record prefix and suffix salt.
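As a minimal sketch of that advice, assuming Python's standard `hashlib` and PBKDF2-HMAC-SHA256: the function names, the iteration count, and the sample SSN below are illustrative assumptions, not a vetted configuration. PBKDF2 takes a single salt argument, so the prefix/suffix split is not reproduced here. Note that the salt has to be stored with the record, and a value can only be matched against a record by re-hashing it with that record's salt.

```python
import hashlib
import hmac
import os

# Assumed work factor; pick a value that is acceptably slow on your hardware.
ITERATIONS = 200_000


def hash_ssn(ssn: str, salt: bytes) -> bytes:
    """Derive a slow, salted hash of an SSN with PBKDF2-HMAC-SHA256."""
    # Normalize "123-45-6789" and "123456789" to the same digit string.
    digits = ssn.replace("-", "").strip()
    return hashlib.pbkdf2_hmac("sha256", digits.encode("ascii"), salt, ITERATIONS)


def matches(ssn: str, salt: bytes, stored: bytes) -> bool:
    """Re-hash a candidate SSN with the record's salt and compare in constant time."""
    return hmac.compare_digest(hash_ssn(ssn, salt), stored)


# Hypothetical record: a fresh random salt is generated and kept next to the hash.
salt = os.urandom(16)
stored = hash_ssn("123-45-6789", salt)
```

With this layout, `matches("123456789", salt, stored)` succeeds because the input is normalized before hashing, while any other nine-digit value fails.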
The downside of hashing SSNs is that SSNs are predictable and have very little entropy (nine decimal digits, so at most 10^9 candidates), which makes brute-forcing the plaintext quite easy. If you can afford it, I'd suggest investing in hardware protection (e.g. an HSM) for this kind of thing. In fact, you should avoid identifying people by their SSN entirely.
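To illustrate how little the small keyspace protects against guessing, here is a rough sketch that recovers an SSN from a fast, unsalted SHA-256 hash by enumeration. The helper names and the sample value are hypothetical, and the search is limited to four unknown digits so that it finishes instantly; the same loop over all nine digits is only 10^9 hash evaluations.

```python
import hashlib


def fast_hash(ssn_digits: str) -> str:
    """Unsalted SHA-256 of the digit string (deliberately the weak choice)."""
    return hashlib.sha256(ssn_digits.encode("ascii")).hexdigest()


def brute_force(target: str, known_prefix: str, unknown_digits: int):
    """Try every possible suffix of the given length until the hash matches."""
    for n in range(10 ** unknown_digits):
        candidate = known_prefix + str(n).zfill(unknown_digits)
        if fast_hash(candidate) == target:
            return candidate
    return None


# Hypothetical leaked hash; the attacker is assumed to know the first five digits.
target = fast_hash("123456789")
recovered = brute_force(target, "12345", 4)
```

A slow algorithm and a per-record salt raise the cost of each guess and prevent precomputed tables, but they do not enlarge the set of candidates.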
Upvotes: 1
Reputation: 21722
"So, theoretically, if a data provider were able to provide you with a hashed value of the SSN, and you were to store that hash and use it for matching, you would never have to use or store the SSN."
That is false: hashes are by design not guaranteed to be unique, since distinct inputs can produce the same value, so they cannot be relied on to uniquely identify anything. If you must uniquely identify something and are not allowed to use someone else's identifier, you must come up with your own. That is why gas cards, movie rental cards, and the like come with their own unique membership identifiers.
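A minimal sketch of issuing your own identifier, assuming random UUIDs from Python's standard `uuid` module; the `MemberRegistry` class and its methods are hypothetical names used only for illustration.

```python
import uuid


class MemberRegistry:
    """Toy in-memory registry that assigns each member its own identifier."""

    def __init__(self):
        self._members = {}

    def enroll(self, name: str) -> str:
        # The identifier is random and unrelated to any SSN or other external key.
        member_id = str(uuid.uuid4())
        self._members[member_id] = name
        return member_id

    def lookup(self, member_id: str):
        # Returns None when the identifier was never issued.
        return self._members.get(member_id)
```

Two people with the same name receive different identifiers, and nothing about the identifier reveals or depends on the SSN.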
Upvotes: 0