HorseloverFat

Reputation: 3336

db.StringProperty but property is occasionally datastore_types.Text?

Are there any circumstances under which GAE datastore might change a property type from StringProperty to TextProperty (effectively ignoring the model definition)?

Consider the following situation (simplified):

class MyModel(db.Model):
    user_input = db.StringProperty(default='', multiline=True)

Some entities of this model in my datastore hold a datastore_types.Text value for 'user_input' (rather than a plain str) and are therefore not indexed. I can fix this by retrieving the entity, setting model_instance.user_input = str(model_instance.user_input) and then putting the entity back into the datastore.

What I don't understand is how this is happening to only some entities, when there have been no changes to this model. Is it possible that the 'type' of a db.Model's property can be overridden from StringProperty to TextProperty?

Upvotes: 3

Views: 666

Answers (4)

max

Reputation: 29983

At least in ndb, there is a special code path for a StringProperty that contains Unicode when the property is not indexed. In this case it is changed into a Text property. The relevant snippet is:

if isinstance(value, str):
  v.set_stringvalue(value)
elif isinstance(value, unicode):
  v.set_stringvalue(value.encode('utf8'))
  if not self._indexed:
    p.set_meaning(entity_pb.Property.TEXT)

I'm unable to observe this on the Python side (see db.StringProperty but property is occasionally datastore_types.Text?) but I see it in the datastore and it caused headaches when loading the Datastore into BigQuery.

So use only:

  • StringProperty(indexed=True)
  • TextProperty()

Avoid StringProperty(indexed=False) if you might write mixed unicode and str values to the property, as tends to happen with external data.
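To make the consequence concrete, here is a minimal sketch of that branch as a plain function that runs without App Engine. It is a Python 3 translation, so Python 2 `str` is modelled as `bytes` and `unicode` as `str`. The name `stored_meaning` and the 'string'/'text' labels are made up for this illustration and are not part of ndb.

```python
def stored_meaning(value, indexed):
    """Mirror the quoted branch: which meaning would the stored value get?

    Assumption: Python 2 `str` is modelled as `bytes`, `unicode` as `str`.
    """
    if isinstance(value, bytes):
        # Byte strings never get the TEXT meaning in the quoted snippet.
        return 'string'
    if isinstance(value, str):
        # Unicode values get the TEXT meaning only when the property is unindexed.
        return 'string' if indexed else 'text'
    raise TypeError('expected bytes or str')


cases = [(b'abc', False), ('abc', False), (b'abc', True), ('abc', True)]
results = [stored_meaning(value, indexed) for value, indexed in cases]
print(results)
```

Only the unindexed Unicode case differs from the rest, which is why a property fed both kinds of value can end up with mixed stored types.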

Upvotes: 1

HorseloverFat

Reputation: 3336

Turns out there IS a way the type of a property from the model's definition can be bypassed (and this was our case)! The following self-contained example runs in the interactive console and demonstrates the bug. Essentially, setattr bypasses the property's type if you assign the value of an existing property of a different type:

from google.appengine.ext import db
class TestModel(db.Model):
    string_type = db.StringProperty(default='', multiline=True)
    text_type = db.TextProperty()

some_instance = TestModel()
some_instance.text_type = 'foobar'
setattr(some_instance, 'string_type', some_instance.text_type)
some_instance.put()

retrieved_instance = db.get(some_instance.key())

print id(retrieved_instance.string_type) #10166414447674124776
print id(retrieved_instance.text_type) #10166414447721032360

print type(retrieved_instance.string_type) #OOPS: string_type is now Text type!

Upvotes: 0

Tim Hoffman

Reputation: 12986

With your use of setattr(some_instance, 'string_type', some_instance.text_type) you have actually made the instance property named string_type point to the text_type property. So: two names, one property.

The db, ndb, users, urlfetch, and memcache modules are imported.
> from google.appengine.ext import db
> class TestModel(db.Model):
...    string_type = db.StringProperty(default='', multiline=True)
...    text_type = db.TextProperty()
... 
> 
> some_instance = TestModel()
> some_instance.text_type = 'foobar'
> setattr(some_instance, 'string_type', some_instance.text_type)
> print type(some_instance.string_type)
<class 'google.appengine.api.datastore_types.Text'>
> repr(some_instance.string_type)
"u'foobar'"
> id(some_instance.string_type)
168217452
> id(some_instance.text_type)
168217452
> some_instance.string_type.__class__
<class 'google.appengine.api.datastore_types.Text'>
> some_instance.text_type.__class__
<class 'google.appengine.api.datastore_types.Text'>
> dir(some_instance.text_type.__class__)

Note the id of the two properties above.

So you are effectively rebinding the instance-level property, and the modified instance is then written to the datastore with the type of that field.

If you want to use setattr (though I can't see why in this contrived example), you should first get the value from the property, using setattr(some_instance.string_type,TestModel.string_type.get_value_for_datastore(some_instance) ), so that you assign the value and do not rebind the instance's property.
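The effect can be reproduced without App Engine. The sketch below is a toy model, not the db implementation: `Text` is a hypothetical stand-in for datastore_types.Text, and `StringProp` assumes that the property only does an isinstance check on assignment and stores the value unchanged. Under those assumptions, assigning one property's value to the other shares the same object, subclass type included, while converting with str() first yields a plain string, as in the fix described in the question.

```python
class Text(str):
    """Hypothetical stand-in for datastore_types.Text (a string subclass)."""


class StringProp(object):
    """Toy property: validates with isinstance and stores the value as-is."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__.get(self.name, '')

    def __set__(self, instance, value):
        if not isinstance(value, str):
            raise TypeError('%s must be a string' % self.name)
        # A Text instance passes the check and keeps its subclass type.
        instance.__dict__[self.name] = value


class ToyModel(object):
    string_type = StringProp('string_type')
    text_type = StringProp('text_type')


m = ToyModel()
m.text_type = Text('foobar')
setattr(m, 'string_type', m.text_type)

same_object = m.string_type is m.text_type   # both names refer to one object
leaked_type = type(m.string_type).__name__   # the subclass type came along

# Converting to a plain string before assigning drops the subclass.
m.string_type = str(m.text_type)
fixed_type = type(m.string_type).__name__

print(same_object, leaked_type, fixed_type)
```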

Upvotes: 0

Matthew H

Reputation: 5879

There's a length limit on StringProperty of 500 characters. I don't know for sure, but maybe the datastore will convert a StringProperty to a TextProperty if it goes over the limit.

Having said that, I doubt GAE would just change an indexed property to a non-indexed one implicitly. But it's all I can think of.

Upvotes: 0
