Marten

Reputation: 3872

BadValueError when storing repeated KeyProperty with custom validator

I'm trying to simplify the JSON serialization and de-serialization of ndb KeyProperties by utilizing a custom validator that converts a string to a Key of the respective kind.

The idea is to have a property like this:

from google.appengine.ext import ndb

def key_validator(kind):
  def validator(prop, value):
    if not isinstance(value, ndb.Key):
      return ndb.Key(kind, value)
    return value
  return validator

class Bar(ndb.Model):

  foo = ndb.KeyProperty('Foo', validator=key_validator('Foo'))

As you can see the validator converts any string to a Key of the given kind. The goal is to be able to pass a JSON object that contains the id of a Key to the populate method like so:

import json

bar = Bar()
bar.populate(**json.loads('{"foo": "1234"}'))  # populate() takes keyword arguments

Which should effectively do this:

bar = Bar()
bar.foo = ndb.Key("Foo", "1234")

The problem is that this requires overriding KeyProperty: the validator is only called after some basic validation has been performed, and that validation fails because "1234" is not a Key (see issue 268).
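The ordering problem can be reproduced without ndb. The following is a minimal sketch, not ndb's actual code: `ToyKey`, `ToyKeyProperty` and `to_key` are hypothetical stand-ins, and the only assumption carried over is that the built-in type check runs before the user-supplied validator.

```python
class ToyKey(object):
    """Hypothetical stand-in for ndb.Key; holds only a kind and an id."""
    def __init__(self, kind, id):
        self.kind = kind
        self.id = id


class ToyKeyProperty(object):
    """Hypothetical property: built-in type check first, user validator second."""
    def __init__(self, validator=None):
        self._validator = validator

    def _validate(self, value):
        # Built-in check, analogous to the "Expected Key" check.
        if not isinstance(value, ToyKey):
            raise ValueError('Expected Key, got %r' % (value,))

    def _do_validate(self, value):
        self._validate(value)            # runs first and rejects "1234"
        if self._validator is not None:  # never reached for a plain string
            newvalue = self._validator(self, value)
            if newvalue is not None:
                value = newvalue
        return value


def to_key(prop, value):
    # Same idea as key_validator('Foo') in the question.
    if not isinstance(value, ToyKey):
        return ToyKey('Foo', value)
    return value


prop = ToyKeyProperty(validator=to_key)
try:
    prop._do_validate('1234')
    rejected = False
except ValueError:
    rejected = True
```

In this sketch the string is rejected before `to_key` gets a chance to convert it, which is why the validator call has to be moved in front of the type check.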

So to make this work I've created a "ValidationMixin" and a new KeyProperty that calls the validator before any other validation takes place (and also serializes the Key to just the id).

class ValidationMixin(object):
  # make sure to call _validator before we do as the very first validation step
  def _do_validate(self, value):
    if self._validator is not None:
      newvalue = self._validator(self, value)
      if newvalue is not None:
        value = newvalue
    return super(ValidationMixin, self)._do_validate(value)

# A KeyProperty that allows a validator to generate a Key.
# In addition it serializes to just the id of the key
class KeyProperty(ValidationMixin, ndb.KeyProperty):
  # return just the id of the key
  def _get_for_dict(self, entity):
    value = self._get_value(entity)
    if self._repeated:
      return [v.id() for v in value]
    elif value is not None:
      return value.id()
    return value

Using this KeyProperty works like a charm for non-repeated properties. Unfortunately it fails badly with properties that have repeated=True.

The following exception is thrown when I call bar.populate(json.loads('[{"foo": "1234"}]')) followed by put():

  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 3451, in _put
    return self._put_async(**ctx_options).get_result()
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/tasklets.py", line 383, in get_result
    self.check_success()
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/tasklets.py", line 427, in _help_tasklet_along
    value = gen.throw(exc.__class__, exc, tb)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/context.py", line 824, in put
    key = yield self._put_batcher.add(entity, options)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/tasklets.py", line 430, in _help_tasklet_along
    value = gen.send(val)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/context.py", line 358, in _put_tasklet
    keys = yield self._conn.async_put(options, datastore_entities)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 1852, in async_put
    pbs = [entity_to_pb(entity) for entity in entities]
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 697, in entity_to_pb
    pb = ent._to_pb()
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 3167, in _to_pb
    prop._serialize(self, pb, projection=self._projection)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 1422, in _serialize
    values = self._get_base_value_unwrapped_as_list(entity)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 1192, in _get_base_value_unwrapped_as_list
    wrapped = self._get_base_value(entity)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 1180, in _get_base_value
    return self._apply_to_values(entity, self._opt_call_to_base_type)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 1355, in _apply_to_values
    newvalue = function(value)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 1234, in _opt_call_to_base_type
    value = _BaseValue(self._call_to_base_type(value))
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 1255, in _call_to_base_type
    return call(value)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 1331, in call
    newvalue = method(self, value)
  File "/base/data/home/runtimes/python27/python27_lib/versions/1/google/appengine/ext/ndb/model.py", line 2013, in _validate
    raise datastore_errors.BadValueError('Expected Key, got %r' % (value,))
BadValueError: Expected Key, got [Key('Foo', '486944fe896a44c689275e6f19e3084a')]

As you can see, it complains about the value being a list instead of a single Key. Note that the exception is thrown in put(), not in populate(), so the initial validation performed by _set_value succeeded.

So my question is, is my approach broken or should that work? If it should work, why doesn't it work and how can it be fixed?

update

According to the stack trace, execution passes through model.py, line 1355, which is strange: the property is repeated, so it should take the other branch at model.py, line 1347.

update 2

I just discovered that it works when I remove another, non-repeated KeyProperty from the model. It looks like the serialization is broken and the wrong KeyProperty instance is passed to the _serialize method.

Upvotes: 2

Views: 438

Answers (1)

Marten

Reputation: 3872

Ok, found it. KeyProperty has this really weird constructor "signature magic" (model.py, line 1963).

The point is that if the first positional parameter is a string, it becomes the field name of the property, not the kind! If you want to specify the kind as a string, you must use a keyword argument; otherwise the kind parameter must be the actual model class, not just its name. Correct me if I'm wrong, but that's not part of the public documentation. It's really confusing, because with ndb.Key you actually can specify the kind as a string in the first positional parameter.
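To make the rule concrete, here is a small sketch of the argument handling as described above. It is not the code from model.py: `toy_key_property_args` and `ToyModel` are hypothetical names, and the function only mimics the behaviour that a positional string is taken as the name while a positional model class is taken as the kind.

```python
class ToyModel(object):
    """Hypothetical stand-in for an ndb.Model subclass."""


def toy_key_property_args(*args, **kwds):
    """Return (name, kind) following the positional rules described above."""
    name = kwds.get('name')
    kind = kwds.get('kind')
    for arg in args:
        if isinstance(arg, str):
            name = arg           # a positional string is taken as the name
        elif isinstance(arg, type) and issubclass(arg, ToyModel):
            kind = arg.__name__  # a positional model class is taken as the kind
        else:
            raise TypeError('unexpected positional argument %r' % (arg,))
    return name, kind


class Foo(ToyModel):
    pass


positional_string = toy_key_property_args('Foo')   # name='Foo', no kind
keyword_kind = toy_key_property_args(kind='Foo')   # kind='Foo', no name
positional_class = toy_key_property_args(Foo)      # kind='Foo', no name
```

Only the first call silently ends up with a property name instead of a kind, which is the case from the question.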

As it happened I had 3 KeyProperties with the same kind, but different attribute names. However, since I specified the kind as a string, it actually became the name. So all three properties were using the same name. As a result, the repeated property value was serialized with the non-repeated KeyProperty instance, causing this crash.
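The collision can be sketched in plain Python. This is a simplified model, not ndb's implementation: `ToyProperty`, `registry` and `values` are hypothetical, and the sketch assumes that properties are looked up by their stored name and that the last registration under a given name wins.

```python
class ToyProperty(object):
    """Hypothetical property; values live in a dict keyed by the stored name."""
    def __init__(self, name, repeated=False):
        self.name = name
        self.repeated = repeated

    def set(self, values, value):
        values[self.name] = value

    def serialize(self, values):
        value = values[self.name]
        items = value if self.repeated else [value]
        for item in items:
            if isinstance(item, list):
                raise ValueError('Expected Key, got %r' % (item,))
        return items


# Both properties ended up with the name 'Foo' because the kind was
# passed as a positional string.
many = ToyProperty('Foo', repeated=True)
single = ToyProperty('Foo')

# A registry keyed by name keeps only one of them. Which one survives is
# an assumption of this sketch: here the last registration wins.
registry = {}
for prop in (many, single):
    registry[prop.name] = prop

values = {}
many.set(values, ['key-1'])  # assignment through the repeated property succeeds

try:
    registry['Foo'].serialize(values)  # serialized by the non-repeated instance
    error = None
except ValueError as exc:
    error = str(exc)
```

Assignment succeeds and only serialization fails, which matches the observation that the exception appears in put() rather than in populate().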

The solution was to specify the kind with a keyword argument:

foo = ndb.KeyProperty(kind='Foo', validator=key_validator('Foo'))

Serializing the KeyProperties from/to JSON works well now.

Upvotes: 1
