Reputation: 7442
I have a custom EncryptedCharField, which should basically appear as a CharField in the UI, but which encrypts its value before storing it in the DB and decrypts it on retrieval.
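The custom fields documentation says to:
1. set __metaclass__ = models.SubfieldBase,
2. override to_python to convert the raw stored value into the desired Python value, and
3. override get_prep_value to convert the value before it is stored in the DB.
So you'd think this would be easy enough: for 2. just decrypt the value, and for 3. just encrypt it.
Based loosely on a Django snippet and the documentation, this field looks like: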
class EncryptedCharField(models.CharField):
    """Just like a char field, but encrypts the value before it enters the database, and decrypts it when it
    retrieves it"""
    __metaclass__ = models.SubfieldBase

    def __init__(self, *args, **kwargs):
        super(EncryptedCharField, self).__init__(*args, **kwargs)
        cipher_type = kwargs.pop('cipher', 'AES')
        self.encryptor = Encryptor(cipher_type)

    def get_prep_value(self, value):
        return encrypt_if_not_encrypted(value, self.encryptor)

    def to_python(self, value):
        return decrypt_if_not_decrypted(value, self.encryptor)


def encrypt_if_not_encrypted(value, encryptor):
    if isinstance(value, EncryptedString):
        return value
    else:
        encrypted = encryptor.encrypt(value)
        return EncryptedString(encrypted)


def decrypt_if_not_decrypted(value, encryptor):
    if isinstance(value, DecryptedString):
        return value
    else:
        encrypted = encryptor.decrypt(value)
        return DecryptedString(encrypted)


class EncryptedString(str):
    pass


class DecryptedString(str):
    pass
and the Encryptor looks like:
class Encryptor(object):
    def __init__(self, cipher_type):
        imp = __import__('Crypto.Cipher', globals(), locals(), [cipher_type], -1)
        self.cipher = getattr(imp, cipher_type).new(settings.SECRET_KEY[:32])

    def decrypt(self, value):
        # values should always be encrypted no matter what!
        # raise an error if things may have been tampered with
        return self.cipher.decrypt(binascii.a2b_hex(str(value))).split('\0')[0]

    def encrypt(self, value):
        if value is not None and not isinstance(value, EncryptedString):
            padding = self.cipher.block_size - len(value) % self.cipher.block_size
            if padding and padding < self.cipher.block_size:
                value += "\0" + ''.join([random.choice(string.printable) for index in range(padding-1)])
            value = EncryptedString(binascii.b2a_hex(self.cipher.encrypt(value)))
        return value
Saving a model raises an "Odd-length string" error, caused by an attempt to decrypt a string that is already decrypted. In the debugger, to_python appears to be called twice: first with the encrypted value, and then with the decrypted value, which arrives as a raw string rather than as a DecryptedString, causing the error. Furthermore, get_prep_value is never called.
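For anyone trying to reproduce the error message outside Django, here is a minimal sketch. It assumes Python 3 (the code above targets Python 2) and uses made-up byte values; it only shows what binascii.a2b_hex does when it is handed plaintext instead of hex-encoded ciphertext.

```python
import binascii

# What the DB column would hold: ciphertext bytes encoded as hex (made-up bytes).
ciphertext_hex = binascii.b2a_hex(b"\x8f\x02\xa1").decode("ascii")
round_trip = binascii.a2b_hex(ciphertext_hex)  # fine: even number of hex digits

# Already-decrypted plaintext with an odd number of characters.
try:
    binascii.a2b_hex("secret1")
    odd_message = None
except binascii.Error as exc:
    odd_message = str(exc)

# Plaintext of even length fails too, just with a different message.
try:
    binascii.a2b_hex("secret")
    even_message = None
except binascii.Error as exc:
    even_message = str(exc)
```

So the message itself says nothing about encryption; it only means that the string passed to decrypt was not hex.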
What am I doing wrong?
This should not be that hard - does anyone else think this Django field code is very poorly written, especially when it comes to custom fields, and not that extensible? Simple overridable pre_save and post_fetch methods would easily solve this problem.
Upvotes: 18
Views: 24739
Reputation: 3760
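I think the issue is that to_python is also called when you assign a value to your custom field (possibly as part of validation). So the problem is to distinguish between to_python calls in two situations: when Django loads a value from the database, which needs decrypting, and when you assign a value to the field yourself, which is already plain text.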
One hack you could use is to add a prefix or suffix to the value string and check for that instead of doing an isinstance check.
I was going to write an example, but I found this one (even better :)).
Check BaseEncryptedField: https://github.com/django-extensions/django-extensions/blob/2.2.9/django_extensions/db/fields/encrypted.py (link to an older version because the field was removed in 3.0.0; see Issue #1359 for reason of deprecation)
Source: Django Custom Field: Only run to_python() on values from DB?
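A rough sketch of the prefix hack, in plain Python 3 with no Django involved. The names PREFIX, toy_encrypt, toy_decrypt and to_db are made up for illustration, and base64 stands in for a real cipher (it is an encoding, not encryption).

```python
import base64

PREFIX = "enc$"  # hypothetical marker; pick something a plaintext value never starts with


def toy_encrypt(plaintext):
    # Placeholder for a real cipher: base64 is NOT encryption.
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def toy_decrypt(ciphertext):
    return base64.b64decode(ciphertext.encode("ascii")).decode("utf-8")


def to_db(value):
    # Encrypt only values that do not already carry the marker.
    if value is None or value.startswith(PREFIX):
        return value
    return PREFIX + toy_encrypt(value)


def to_python(value):
    # Decrypt only values that carry the marker; anything else is plaintext.
    if value is None or not value.startswith(PREFIX):
        return value
    return toy_decrypt(value[len(PREFIX):])
```

The marker survives the round trip through the database, which a str subclass does not. The obvious weakness is that a plaintext value that happens to start with the marker would be mistaken for ciphertext.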
Upvotes: 11
Reputation: 23532
Since this question was originally answered, a number of packages have been written to solve this exact problem.
For example, as of 2018, the package django-encrypted-model-fields handles this with a syntax like
from django.db import models
from encrypted_model_fields.fields import EncryptedCharField

class MyModel(models.Model):
    encrypted_char_field = EncryptedCharField(max_length=100)
    ...
As a rule of thumb, it's usually a bad idea to roll your own solution to a security challenge when a more mature solution exists out there -- the community is a better tester and maintainer than you are.
Upvotes: 2
Reputation: 1179
You need to add a to_python method that deals with a number of cases, including passing on an already decrypted value.
(warning: snippet is cut from my own code - just for illustration)
def to_python(self, value):
    if not value:
        return
    if isinstance(value, _Param):  # THIS IS THE PASSING-ON CASE
        return value
    elif isinstance(value, unicode) and value.startswith('{'):
        param_dict = str2dict(value)
    else:
        try:
            param_dict = pickle.loads(str(value))
        except:
            raise TypeError('unable to process {}'.format(value))
    param_dict['par_type'] = self.par_type
    classname = '{}_{}'.format(self.par_type, param_dict['rule'])
    return getattr(get_module(self.par_type), classname)(**param_dict)
By the way: instead of get_db_prep_value you should use get_prep_value (the former is for DB-specific conversions; see https://docs.djangoproject.com/en/1.4/howto/custom-model-fields/#converting-python-objects-to-query-values ).
Upvotes: 1
Reputation: 17639
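You should be overriding to_python, like the snippet did.
If you take a look at the CharField class you can see that it doesn't have a value_to_string method.
The docs say that the to_python method needs to deal with three things:
1. an instance of the correct type,
2. a string (e.g., from a deserializer),
3. whatever the database returns for the column type you're using.
You are currently only dealing with the third case.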
One way to handle this is to create a special class for a decrypted string:
class DecryptedString(str):
    pass
Then you can detect this class and handle it in to_python():
def to_python(self, value):
    if isinstance(value, DecryptedString):
        return value
    # decrypt the incoming value (the field stores its cipher as self.encryptor)
    decrypted = self.encryptor.decrypt(value)
    return DecryptedString(decrypted)
This prevents you from decrypting more than once.
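To see the marker class at work without a database, here is a small standalone sketch in Python 3. toy_decrypt is a made-up stand-in for the real cipher (it only hex-decodes), and to_python is written as a free function rather than a field method.

```python
import binascii


class DecryptedString(str):
    """Marker type: a str that is known to hold plaintext."""


def toy_decrypt(hex_text):
    # Placeholder for a real cipher: hex decoding is NOT encryption.
    return binascii.unhexlify(hex_text).decode("utf-8")


def to_python(value):
    # Pass through None and anything already marked as decrypted.
    if value is None or isinstance(value, DecryptedString):
        return value
    return DecryptedString(toy_decrypt(value))


stored = binascii.hexlify(b"secret").decode("ascii")  # what the DB would return
once = to_python(stored)   # decrypted and wrapped in the marker type
twice = to_python(once)    # recognised as plaintext, returned unchanged
```

Note that the check only helps while the value keeps its marker type. A plain str assigned by application code is still treated as ciphertext, so it would have to be wrapped in DecryptedString first.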
Upvotes: 4
Reputation: 599638
You forgot to set the metaclass:
class EncryptedCharField(models.CharField):
    __metaclass__ = models.SubfieldBase
The custom fields documentation explains why this is necessary.
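For readers on newer Django versions: SubfieldBase was deprecated in Django 1.8 and removed in 1.10, and fields now get a from_db_value hook that is called only for values loaded from the database. Below is a framework-free sketch of how the work could be split across the hooks. EncryptedFieldSketch, toy_encrypt and toy_decrypt are made-up names, base64 is only a placeholder for a real cipher, and in a real project the class would subclass models.CharField instead of standing alone.

```python
import base64


def toy_encrypt(text):
    # Placeholder only: base64 is an encoding, not encryption.
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def toy_decrypt(text):
    return base64.b64decode(text.encode("ascii")).decode("utf-8")


class EncryptedFieldSketch:
    """Plain-Python stand-in that mirrors the names of Django's field hooks."""

    def from_db_value(self, value, expression, connection):
        # Only values coming from the database arrive here, so always decrypt.
        return value if value is None else toy_decrypt(value)

    def to_python(self, value):
        # Used for validation and deserialization; the value is already plaintext.
        return value

    def get_prep_value(self, value):
        # Runs before the value is sent to the database, so always encrypt.
        return value if value is None else toy_encrypt(value)


field = EncryptedFieldSketch()
stored = field.get_prep_value("secret")            # what would be written to the DB
loaded = field.from_db_value(stored, None, None)   # what the model attribute would hold
```

With this split there is no need to guess inside to_python whether a string is encrypted, because each hook only ever sees one kind of value.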
Upvotes: 3