user882134
user882134

Reputation: 309

Can I hash / encrypt a database TEXT column?

Apologies in advance for what may be a silly question, but I am working on building a little "journal" website, where users can type in daily thoughts in a private way. I'm currently storing this information in a MEDIUMTEXT datatype in a MySQL database.

My question is: is there a way to store this so that individuals (like myself) who have access to the database are not able to read the field to secure users privacy, similar to how I might hash a password?

Thanks in advance

Upvotes: 0

Views: 3269

Answers (1)

Philip Petrov
Philip Petrov

Reputation: 970

The better way is to encrypt data before the DBMS (with whatever programming language you are using on the website), store it already encrypted in MySQL/MariaDB, then read the encrypted data back and decrypt it again inside the software outside of the DBMS. That way your website software will do the encryption/decryption and MySQL/MariaDB will just store the data.

If you need to do it in SQL, here is a simple (not very secure!) way to do this by using the AES_ENCRYPT and AES_DECRYPT functions. First I will create a simple test table

CREATE TABLE encrypt(
    col MEDIUMBLOB
);

When I insert the data, I will use the encrypt function:

-- do this if you are unsure if the general log is enabled systemwide
SET sql_log_off = 'ON';

INSERT INTO encrypt(col) VALUES
(AES_ENCRYPT("blah","mysecretpassword"));

To read the data back, you must use AES_DECRYPT:

SELECT CONVERT(AES_DECRYPT(col, "mysecretpassword") USING utf8) AS col
FROM encrypt;
+------+
| col  |
+------+
| blah |
+------+

-- now you can turn the log back on if you need it 
-- and it is not disabled globally in my.cnf
SET sql_log_off = 'OFF';

Supplying wrong password will give a NULL value. Note that while it will probably work flawless, MEDIUMTEXT is not the best datatype - when you store the encrypted data it is all binary so BLOB is better (it does not care for collations, encodings, etc).

Why it is not recommended? First you should be careful for log files (like general log or log-slow-queries). If they are enabled, the DBMS may log your secret key in plaintext and it will be easy to recover for people who have administrative access to the machine. So if you are going to use that way, you must definitely disable logging! In the example above I showed how you can use the sql_log_off variable which will disable the plaintext general query log only for the current session (it will not disable it serverwide that way so other queries will be logged).

But that's not the whole story about logging - there is also a binary log for transactions. If enabled, it will log changes to data (like all INSERT, UPDATE and DELETE statements in particular). There is mysqlbinlog utility using which people with admin rights on the DBMS machine will be able to recover data with it... and they may eventually recover your secret key from the INSERT statement above as well. If you want to prevent the binlog too, you must do this before executing the INSERT statement:

 SET sql_log_bin = 'OFF';

And of course enable it back on after it. Note that it cannot happen inside a transaction - the bin log is needed for transaction management. Also this makes your database a bit crash unsafe. If the system crashes in the middle of the insert statement when your bin log is disabled, it may corrupt the data in the table. So final conclusion here - it is definitely better to do encryption/decryption in the application and do not do it in the DB. It will save you a lot of hassle with the logging issue.

Second the example above is not secure encryption because (by default) it is using the unsecure ECB mode. Briefly - all data is separated in blocks and each block is being encrypted the same way with the same key. That way equal plaintext blocks will result in the same encrypted blocks - this may leak patterns. Therefore it is better to use some block-chaining mode with initialization vector - it's a much stronger encryption. Unfortunately if you are using MariaDB, you should stop here as it does not support anything other than ECB yet (they are working to add other modes in future). If you are using MySQL, continue reading to improve the sql solution...

With MySQL you should consider switching the default encryption mode and start using the third IV (initialization vector) parameter of the encrypt/decrypt functions which must be 16 bytes (it should be a random value which is not a secret and it can be stored directly in the DB). Here is how:

  1. First change the block_encryption_mode system variable to something different than ECB (check here). You can use CBC for example.

  2. Then change the above queries like this:

    CREATE TABLE encrypt(
         col MEDIUMBLOB,
         iv BINARY(16)
    );
    
    -- Run this each time you encrypt/decrypt
    -- if you cannot guarantee that it is
    -- set properly in my.cnf
    SET block_encryption_mode="aes-256-cbc";
    
    SET sql_log_off = 'ON';
    
    INSERT INTO encrypt(iv, col)
    VALUES(RANDOM_BYTES(16), AES_ENCRYPT("blah", "mysecretpassword", iv));
    
    SELECT CONVERT(AES_DECRYPT(col, "mysecretpassword", iv) USING utf8) AS col
    FROM encrypt;
    
    SET sql_log_off = 'OFF';
    

Upvotes: 3

Related Questions