lecstor

Reputation: 5707

Schema Upgrades and last_modified attribute (GAE & NDB)

I've just finished setting up the foundations for performing schema upgrades on GAE's datastore using mapreduce. We're using NDB, and many of our models use the auto_now keyword option of DateTimeProperty to set a last_modified attribute.

last_modified = ndb.DateTimeProperty( auto_now=True )

Of course, running the mapreduce job that updates the entities also updates the last_modified attribute, which is not really what we want.

from mapreduce import operation as op

def upgrade_entity(entity):
    # modify entity
    yield op.db.Put(entity)

According to the docs you can override the value for a property with auto_now_add set, but not with auto_now.

I'm now thinking there may well be other circumstances, too, where we don't want the last_modified attribute to be updated.

So, is there any way to preserve the entity's last_modified value, or do we add another property, or replace these properties with ones we can control and just set the values manually?


OK, so the consensus seems to be that I should be able to define an alternate version of the model that is used only by the mapreduce code, not the user-facing code (I very much want to avoid having to shut down the site to do a schema upgrade), but I haven't been able to get this to work.

With the following setup, the user-facing code works properly (updates last_modified) until I run mapreduce, which also works properly (doesn't update last_modified). After running mapreduce, the user-facing code no longer updates last_modified.

models.py

from google.appengine.ext import ndb

class MyModel(ndb.Model):
    # model used by user facing code
    last_modified = ndb.DateTimeProperty(auto_now=True)

upgrade.py

from google.appengine.ext import ndb
from mapreduce import operation as op

class MyTmpModel(ndb.Model):
    # model used by mapreduce code; maps to the same kind as MyModel
    last_modified = ndb.DateTimeProperty(auto_now=False)

    @classmethod
    def _get_kind(cls):
        return 'MyModel'

def upgrade_model(entity):
    # mapper function
    # modify entity
    yield op.db.Put(entity)

mapreduce.yaml

mapreduce:
- name: Upgrade Model
  mapper:
    input_reader: mapreduce.input_readers.DatastoreInputReader
    handler: upgrade.upgrade_model
    params:
    - name: entity_kind
      default: upgrade.MyTmpModel


OK, I'm going to put my issues here down to the fact that I have been testing this on the dev server, and to the differences between how things run there and on the real GAE servers. I've concluded that on the dev server all the code runs in the same process and the different model versions are not getting along. From the NDB model docs:

An application should not define two model classes with the same kind, even if they live in different modules. An application's kinds are considered a global "namespace".

I'll assume that on the real GAE servers the mapreduce code will run in separate instances, so these version clashes will not occur and the user-facing instances will not be affected. If that holds, the setup above should work as expected.

Thanks Tim & Guido for your help.

cheers,

J

Upvotes: 1

Views: 434

Answers (1)

Guido van Rossum

Reputation: 16890

The solution is to set auto_now=False in all your model definitions in the map/reduce code.

My suggestion for doing this with the least chance for errors:

Define a global constant that can be True or False which you use for all the auto_now settings in your model definitions. Then you have to change only that one line to change it from True to False for all models. You can even make it compute the value automatically based on some environment variable.
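
A minimal sketch of that idea, assuming a hypothetical environment variable named SCHEMA_UPGRADE (the name, the helper auto_now_enabled and the constant AUTO_NOW are made up for illustration; nothing in App Engine or NDB defines them). The model definition is left as a comment because it needs the App Engine SDK to import:

```python
import os

# Hypothetical variable name; nothing in App Engine or NDB defines it.
UPGRADE_ENV_VAR = "SCHEMA_UPGRADE"


def auto_now_enabled(environ=None):
    """Return True unless the upgrade variable is set to a truthy value."""
    if environ is None:
        environ = os.environ
    value = environ.get(UPGRADE_ENV_VAR, "").strip().lower()
    return value not in ("1", "true", "yes")


# Evaluated once, when the module is imported.
AUTO_NOW = auto_now_enabled()

# Model definitions would then share the constant (needs the App Engine SDK,
# so it is left as a comment here):
#
# class MyModel(ndb.Model):
#     last_modified = ndb.DateTimeProperty(auto_now=AUTO_NOW)
```

Because AUTO_NOW is computed at import time, the variable would have to be set before the models module is loaded. How it gets set for the map/reduce code depends on your deployment and is not covered by this sketch.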

Upvotes: 1
