Reputation: 149
I want to create the below encrypt and decrypt UDF's in Redshift.
Library:
create library pyaes
language plpythonu
from 's3://aws_python/library/pyaes/pyaes.zip'
credentials 'aws-role'
region as 'aws-region';
Encrypt:
CREATE OR REPLACE FUNCTION test.aes_encrypt(input varchar(max))
RETURNS varchar(max) AS
' if input is None:
return None
import pyaes
key = ''abcdefghijklopoo''
aes = pyaes.AESModeOfOperationCTR(key)
encrypted_msg = aes.encrypt(input)
return encrypted_msg
'
LANGUAGE plpythonu STABLE;
Tried with below options as well:
encrypted_msg = aes.encrypt(input.encode("utf8"))
key = key.encode('utf-8')
Decrypt:
CREATE OR REPLACE FUNCTION test.aes_decrypt(encrypted_msg varchar(max))
RETURNS varchar(max) AS
'
if encrypted_msg is None or len(str(encrypted_msg)) == 0:
return None
import pyaes
key = ''abcdefghijklopoo''
aes = pyaes.AESModeOfOperationCTR (key)
decrypted_msg = aes.decrypt(encrypted_msg).decode("utf8")
return decrypted_msg
'
LANGUAGE plpythonu STABLE;
select aes_encrypt('Testing'); select aes_decrypt('');
But it's throwing below error:
Error: Invalid operation: String contains invalid or unsupported UTF8 codepoints. Bad UTF8 hex sequence: d5 fc (error 4);
Please advise. Thanks in advance.
Upvotes: 2
Views: 2428
Reputation: 51
I worked this through in a notebook where I could view the outputs. Basically an error occurs while the AES function is happening that you can't see the output of, and it passes a bad type back to Redshift because aes returns as a byte.
PLEASE NOTE: THIS IS MEANT FOR ENCRYPTED OBFUSCATION OF COLUMNS OF DATA and is not server side encryption. If security is critical please go encrypt your whole database. If you're going to use this to protect your customers data, please don't, please encrypt salt and hash everything etc etc. THIS IS your DISCLAIMER.
This needs to be converted to something redshift can deal with like hex, so use 'binascii.hexlify(cipher_txt)' to get the value back in a printable manner.
CREATE OR REPLACE FUNCTION aes_encrypt(input VARCHAR(20000))
RETURNS VARCHAR STABLE AS $$
import pyaes
import binascii
if input is None:
return None
key = 'abcdefghijklnosp'
aes=pyaes.AESModeOfOperationCTR(key)
cipher_txt=aes.encrypt(input)
cipher_txt2=binascii.hexlify(cipher_txt)
return str(cipher_txt2.decode('utf-8'))
$$ LANGUAGE plpythonu ;
Note the decode on return is redundant, and the cipher texts are split into lines to illustrate. I'm sure you can put them back into one line. like
cipher_txt=binascii.hexlify(aes.encrypt(input))
To unencrypt:
CREATE OR REPLACE FUNCTION aes_decrypt(encrypted_msg varchar(max))
RETURNS VARCHAR STABLE AS $$
import pyaes
import binascii
if encrypted_msg is None or len(str(encrypted_msg)) == 0:
return None
key = 'abcdefghijklnosp'
aes = pyaes.AESModeOfOperationCTR(key)
encrypted_msg2=binascii.unhexlify(encrypted_msg)
decrypted_msg2 = aes.decrypt(encrypted_msg2)
return str(decrypted_msg2.decode('utf-8'))
$$ LANGUAGE plpythonu ;
Also - side note - you don't need to specify length of the return varchar unless you need to for some other reason (unions or some-such) - because making everything max is a waste of space.
Upvotes: 2