Reputation: 17173
Here is my current method:
from google.appengine.ext import ndb

def delete_up_to_10000(query):
    # Fetch and delete up to 10 batches of 1,000 keys each.
    for i in range(10):
        keys = query.fetch(1000, keys_only=True, deadline=40, batch_size=1000)
        ndb.delete_multi(keys)
My question is: is it possible to delete the results of the query without actually having to fetch the keys first? Shouldn't that be possible?
Here are a few decision points around my current solution:
Upvotes: 1
Views: 604
Reputation: 17173
Here's my current solution:
def _delete_from_query(query, limit, batch_size=2000):
    """Delete the results of `query` in pages of at most `batch_size` keys,
    stopping once `limit` entities have been deleted."""
    delete_count = 0
    next_curs = None
    while True:
        # Never request more keys than are still needed to reach the limit.
        lim = min(batch_size, limit - delete_count)
        keys, next_curs, more = query.fetch_page(
            lim, start_cursor=next_curs, deadline=40, batch_size=lim, keys_only=True
        )
        ndb.delete_multi(keys)
        delete_count += len(keys)
        if not keys or not more or delete_count == limit:
            break
    return delete_count
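For illustration, a call might look like this; the Article model and the 30-day cutoff below are placeholders for the example, not part of my actual code:

import datetime

from google.appengine.ext import ndb

class Article(ndb.Model):
    # Placeholder model for the example.
    created = ndb.DateTimeProperty(auto_now_add=True)

cutoff = datetime.datetime.utcnow() - datetime.timedelta(days=30)
stale_articles = Article.query(Article.created < cutoff)
deleted = _delete_from_query(stale_articles, limit=10000)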
Upvotes: 1
Reputation: 41089
A keys-only query does not retrieve the entities. It looks only at the indexes, and only at the indexes the query actually uses.
A delete operation, on the other hand, must remove not only the entity itself but also the entry for that entity in each and every index, whether it's a single-property index or a composite index.
Thus, a query simply does not have all the information necessary to perform a delete at the same time. A hypothetical "delete what you find" operation would just be shorthand for "find the list of keys, then use those keys to update all the indexes and remove the entities themselves." It might remove some overhead, but at the cost of greater complexity.
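To make that concrete, here is a sketch of what such a shorthand would have to do anyway. This is not an existing NDB API, just the same fetch-then-delete expressed as a helper:

from google.appengine.ext import ndb

def delete_query_results(query, limit=10000):
    # The keys must be fetched first so the delete can clean up
    # every index entry as well as remove the entity itself.
    keys = query.fetch(limit, keys_only=True)
    ndb.delete_multi(keys)
    return len(keys)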
Upvotes: 2
Reputation: 2542
You need to fetch the keys in order to do the delete. Are you trying to do a mass delete and simply spreading it out? You should look into a mapper (i.e. MapReduce). It's perfect for going through large numbers of datastore entries and deleting them. You could run the map job once a day or once a week to keep your data under control.
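For example, with the App Engine MapReduce library the mapper itself can be tiny; this is a sketch based on the classic Mapper API, and the exact module path plus how you register the job in mapreduce.yaml depend on your setup:

from mapreduce import operation as op

def delete_entity(entity):
    # Called once per datastore entity scanned by the mapper;
    # yielding a Delete operation queues the entity for deletion.
    yield op.db.Delete(entity)

You would then point the mapper's handler at delete_entity and set the entity kind in your mapreduce.yaml.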
Upvotes: 2