Reputation: 20890
I've got a situation where I want to make 1000 different queries to the datastore, do some calculations on the results of each individual query (to get 1000 separate results), and return the list of results.
I would like the list of results to be returned as the response from the same 30-second user request that started the calculation, for better client-side performance. Hah!
I have a bold plan.
Each of these operations individually will usually have no problem finishing in under a second, none of them need to write to the same entity group as any other, and none of them need any information from any of the other queries. Might it be possible to start 1000 independent tasks, each taking on one of these queries, doing its calculations, and storing the result in some sort of temporary collection of entities? The original request could wait 10 seconds, and then do a single query for the results from the datastore (maybe they all set a unique value I can query on). Any results that aren't in yet would be noticed at the client end, and the client could just ask for those values again in another ten seconds.
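To make the collect-and-retry idea concrete, here is a minimal sketch in plain Python. It is not App Engine code: the dict `results_store` stands in for the temporary collection of entities, and `store_result`, `collect`, and the `"batch-1"` id are hypothetical names made up for illustration.

```python
# Stand-in for the temporary collection of result entities. A real app would
# store datastore entities tagged with the batch id instead of using a dict.
results_store = {}


def store_result(batch_id, query_id, value):
    """Called by each worker task when its calculation finishes."""
    results_store[(batch_id, query_id)] = value


def collect(batch_id, expected_ids):
    """Return (finished results, ids still missing) for one batch."""
    done = {}
    missing = []
    for query_id in expected_ids:
        key = (batch_id, query_id)
        if key in results_store:
            done[query_id] = results_store[key]
        else:
            missing.append(query_id)
    return done, missing


# Simulate a batch of five tasks where task 3 is running late.
expected = list(range(5))
for query_id in expected:
    if query_id != 3:
        store_result("batch-1", query_id, query_id * query_id)

done, missing = collect("batch-1", expected)
print(done, missing)
```

The client would receive `done` right away and ask again later for only the ids in `missing`.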
The questions I hope experienced appengineers can answer are:
Upvotes: 1
Views: 153
Reputation: 101139
The Task Queue doesn't provide firm guarantees on when a task will execute - the ETA (which defaults to the current time) is the earliest time at which it will execute, but if the queue is backed up, or there are no instances available to execute the task, it could execute much later.
One option would be to use Datastore Plus / NDB, which allows you to execute queries in parallel. 1000 queries is going to be very expensive, however, no matter how you execute them.
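As a rough illustration of running the queries in parallel inside a single request, here is a sketch that uses only the Python standard library. It does not use NDB: the thread pool stands in for NDB's asynchronous query calls, and `run_query`, `run_all`, and the deadline value are assumptions made for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def run_query(query_id):
    """Hypothetical stand-in for one datastore query plus its calculation."""
    time.sleep(0.01)  # pretend to wait on the datastore
    return query_id * 2


def run_all(query_ids, deadline_seconds, max_workers=32):
    """Run the queries concurrently; return (results, ids not done in time)."""
    pool = ThreadPoolExecutor(max_workers=max_workers)
    futures = {pool.submit(run_query, qid): qid for qid in query_ids}
    done, not_done = wait(futures, timeout=deadline_seconds)
    # Do not block on stragglers; cancel any that have not started yet.
    pool.shutdown(wait=False)
    for future in not_done:
        future.cancel()
    results = {futures[f]: f.result() for f in done}
    pending = sorted(futures[f] for f in not_done)
    return results, pending


results, pending = run_all(range(100), deadline_seconds=5.0)
print(len(results), pending)
```

Anything left in `pending` when the deadline passes could be reported to the client so it can ask for those values again, which keeps the request inside its time limit. Note that this only overlaps the waiting; it does not reduce the cost of the 1000 queries themselves.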
Another option, as @Chris suggests, is to use the task queue with the Channel API, so you can notify the user asynchronously when the queries complete.
Upvotes: 1
Reputation: 14175
Yep, sounds pretty ludicrous :)
You shouldn't rely on the Task Queue to operate like that. You can't rely on 1000 tasks being spawned that quickly (although they most likely will be).
Why not use the Channel API to wait for your response? Your solution then becomes:
This would avoid the timeout issues that would very likely arise from time to time because tasks don't execute as fast as you'd like, or for some other reason.
Upvotes: 1