Reputation: 794
I am trying to improve efficiency of my current query from appengine datastore. Currently, I am using a synchronous method:
class Hospital(ndb.Model):
name = ndb.StringProperty()
buildings= ndb.KeyProperty(kind=Building,repeated=True)
class Building(ndb.Model):
name = ndb.StringProperty()
rooms= ndb.KeyProperty(kind=Room,repeated=True)
class Room(ndb.Model):
name = ndb.StringProperty()
beds = ndb.KeyProperty(kind=Bed,repeated=True)
class Bed(ndb.Model):
name = ndb.StringProperty()
.....
Currently I go through stupidly:
currhosp = ndb.Key(urlsafe=valid_hosp_key).get()
nbuilds = ndb.get_multi(currhosp.buildings)
for b in nbuilds:
rms = ndb.get_multi(b.rooms)
for r in rms:
bds = ndb.get_multi(r.beds)
for b in bds:
do something with b object
I would like to transform this into a much faster query using get_multi_async
My difficulty is in how I can do this? Any ideas?
Best Jon
Upvotes: 8
Views: 2273
Reputation: 794
using the given structures above, it is possible, and was confirmed that you can solve this with a set of tasklets. It is a SIGNIFICANT speed up over the iterative method.
@ndb.tasklet
def get_bed_info(bed_key):
bed_info = {}
bed = yield bed_key.get_async()
format and store bed information into bed_info
raise ndb.Return(bed_info)
@nbd.tasklet
def get_room_info(room_key):
room_info = {}
room = yield room_key.get_async()
beds = yield map(get_bed_info,room.beds)
store room info in room_info
room_info["beds"] = beds
raise ndb.Return(room_info)
@ndb.tasklet
def get_building_info(build_key):
build_info = {}
building = yield build_key.get_async()
rooms = yield map(get_room_info,building.rooms)
store building info in build_info
build_info["rooms"] = rooms
raise ndb.Return(build_info)
@ndb.toplevel
def get_hospital_buildings(hospital_object):
buildings = yield map(get_building_info,hospital_object.buildings)
raise ndb.Return(buildings)
Now comes the main call from the hospital function where you have the hospital object (hosp).
hosp_info = {}
buildings = get_hospital_buildings(hospital_obj)
store hospital info in hosp_info
hosp_info["buildings"] = buildings
return hosp_info
There you go! It is incredibly efficient and lets the schedule complete all the information in the fastest possible manner within the GAE backbone.
Upvotes: 12
Reputation: 16890
You can do something with query.map(). See https://developers.google.com/appengine/docs/python/ndb/async#tasklets and https://developers.google.com/appengine/docs/python/ndb/queryclass#Query_map
Upvotes: 3
Reputation: 469
Its impossible. Your 2nd query (ndb.get_multi(b.rooms)) depends on the result of your first query. So pulling it async dosnt work, as at this point the (first) result of the first query has to be avaiable anyway. NDB does something like that in the background (it allready buffers the next items of ndb.get_multi(currhosp.buildings) while you process the first result). However, you could use denormalization, i.e. keeping a big table with one entry per Building-Room-Bed pair, and pull your results from that table. If you have more reads than writes to this table, this will get you a massive speed improvement (1 DB read, instead of 3).
Upvotes: -1