Reputation: 558
I have patient health information stored in a SQL Server 2012 database. When I do a search on a patient's name, their names are encrypted, so the search is very slow. How can I add an index on an encrypted column ?
I am using Symmetric Key encryption (256-bit AES) on varbinary fields.
There are separate encrypted fields for Patient's first name, last name, address, phone number, DOB, SSN. All of these are searchable (partial also) except SSN.
Upvotes: 6
Views: 5114
Reputation: 418
If you are using Microsoft SQL server implicit encryptbykey function, there is no benefit of using index on that column because sql sever encryptbykey function will have different output every time for same input because of random iv used by sql server itself.
Upvotes: 1
Reputation: 28779
To build on the answer that @PhillipH provided: if you are performing an exact search on (say) last name you can include a computed column defined as CHECKSUM(encrypt(last_name))
(with encrypt
your encryption operation). This is secure in that it does not divulge any information -- a checksum on the encrypted value does not reveal anything about the plaintext.
Create an index on this computed column. To search on the name, instead of just doing WHERE encrypted_last_name = encrypt(last_name)
, add a search on the hash: WHERE encrypted_last_name = encrypt(last_name) AND CHECKSUM(encrypt(last_name)) = hashed_encrypted_last_name
. This is much faster because SQL Server only has to search an index for a small integer value, then verify that the name in fact matches, reducing the amount of data to check considerably. Note that no data is decrypted in this scheme, with or without the CHECKSUM
-- we search for the encrypted value only. The speedup does not come from reducing the amount of data that is encrypted/decrypted (only the data you pass in is encrypted) but the amount of data that needs to be indexed and compared for equality.
The only drawback is that this does not allow partial searches, or even case variation, and indeed, doing that securely is not trivial. Case is relatively simple (hash encrypted(TOUPPER(name))
, making sure you use a different key to avoid correlation), but partial matches require specialized indexes. The simplest approach I can think of is to use a separate service like Lucene to do the indexing, but make it use secure storage for its files (i.e. Encrypting File System (EFS) in Windows). Of course, that does mean a separate system that needs to be certified -- but I can't think of any convenient solution that remains entirely in SQL Server and does not require additional code.
If you can still change the database design/storage, you may wish to consider Transparent Data Encryption (TDE) which has the huge advantage that it's, well, transparent and integrated in SQL Server at the engine level. Not only should partial matching be much faster since individual rows don't need decrypting (just whole pages), if it's not fast enough you can create a full-text index which will also be encrypted. I don't know if TDE works with your security requirements, though.
Upvotes: 7
Reputation: 6222
As a programmatic solution, if you dont need a partial match, you could store a hash in the clear on another field and use the same hashing algorithm on the client/app server and match on hash. This would have the possibility of a false positive match but would negate the need to decrypt the data.
Upvotes: 2