Reputation: 458
I have an entity in the datastore with the following fields:
created_date = ndb.DateTimeProperty(auto_now_add=True)
epoch = ndb.IntegerProperty()
sent_requests = ndb.JsonProperty()
I would like to bulk delete all entities that are older than 2 days using a daily cron job. I am aware of ndb.delete_multi(list_of_keys),
but how do I get the list of keys for entities older than 2 days? Is scanning the entire datastore of 100+ million entities and collecting the keys where epoch < int(time.time()) - 2*86400
the best option available?
Upvotes: 0
Views: 796
Reputation: 39824
Yes. Because you only want to delete some of the entities, you need to perform keys_only queries to obtain the keys to pass to ndb.delete_multi()
(or its async version, ndb.delete_multi_async()). Don't worry about the total number of entities: all queries are index-based, so the response time depends on the number of results returned, not on how many entities exist in the datastore.
But it may take some time for the index to be updated after the deletions, so use query cursors, not repeated identical queries (which could return keys already deleted).
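As a rough sketch of the keys-only query plus cursor approach, something like the following could work. It assumes the first-generation App Engine Python ndb library and that epoch is indexed (the default for ndb.IntegerProperty). The names cutoff_epoch and delete_expired, the batch size of 500, and the delete_multi parameter (which only exists so the loop can be exercised without the SDK) are made up for illustration.

```python
import time

SECONDS_PER_DAY = 86400


def cutoff_epoch(now=None, max_age_days=2):
    """Return the epoch second before which an entity counts as expired."""
    if now is None:
        now = time.time()
    return int(now) - max_age_days * SECONDS_PER_DAY


def delete_expired(query, batch_size=500, delete_multi=None):
    """Page through a keys-only query with cursors, deleting each page.

    query is assumed to be an ndb.Query such as
    MyEntity.query(MyEntity.epoch < cutoff_epoch()), where MyEntity is a
    hypothetical name for the model in the question.
    Returns the number of keys passed to delete_multi.
    """
    if delete_multi is None:
        # Imported lazily so this module can be loaded without the SDK.
        from google.appengine.ext import ndb
        delete_multi = ndb.delete_multi

    cursor, more, deleted = None, True, 0
    while more:
        # keys_only=True returns keys instead of full entities.
        keys, cursor, more = query.fetch_page(
            batch_size, start_cursor=cursor, keys_only=True)
        if keys:
            delete_multi(keys)
            deleted += len(keys)
    return deleted
```

The cutoff is computed once, when the query is built, so every page is fetched against the same threshold. Resuming from the cursor means each page continues where the previous one stopped instead of re-running the query from the start.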
Also, if you expect to delete a lot of entities, spread the load across multiple requests (for example, using the task queue or the deferred library) to avoid exceeding the request deadline. See, for example, "How to delete all the entries from google datastore?"
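One possible way to spread the work is to delete a single page per task and have the task re-enqueue itself with the cursor. This is only a sketch under assumptions: it targets the first-generation App Engine Python runtime with the deferred library enabled, the function names and batch size are invented, and model_cls is assumed to be a module-level ndb.Model subclass with an indexed epoch property so that it can be pickled by reference into the task payload.

```python
def delete_one_page(query, start_cursor, batch_size, delete_multi):
    """Delete one page of keys.

    Returns the cursor to resume from, or None when no results remain.
    """
    keys, next_cursor, more = query.fetch_page(
        batch_size, start_cursor=start_cursor, keys_only=True)
    if keys:
        delete_multi(keys)
    return next_cursor if more else None


def delete_expired_task(model_cls, cutoff, cursor_token=None, batch_size=500):
    """Deferred task: delete one page, then enqueue itself for the next."""
    # SDK imports are local so delete_one_page works without the SDK.
    from google.appengine.datastore.datastore_query import Cursor
    from google.appengine.ext import deferred, ndb

    # Rebuild the same query in every task; the cutoff is passed along
    # unchanged so the threshold does not drift between tasks.
    query = model_cls.query(model_cls.epoch < cutoff)
    start = Cursor(urlsafe=cursor_token) if cursor_token else None
    next_cursor = delete_one_page(query, start, batch_size, ndb.delete_multi)
    if next_cursor is not None:
        # Hand the cursor over as a URL-safe string.
        deferred.defer(delete_expired_task, model_cls, cutoff,
                       next_cursor.urlsafe(), batch_size)
```

The daily cron handler would then only need to start the chain, for example with deferred.defer(delete_expired_task, MyEntity, int(time.time()) - 2 * 86400), where MyEntity stands in for the model from the question. Each task does a bounded amount of work, so no single request has to finish the whole deletion.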
Upvotes: 2