Shirish

Reputation: 87

Task Queues or Multithreading on Google App Engine

I have my server on Google App Engine. One of my jobs is to match a large set of records against another. This takes a very long time, for example when I have to match 10,000 records against 100. What is the best way to implement this?

I'm using the web2py stack, and my application is deployed on Google App Engine.

Upvotes: 1

Views: 1042

Answers (3)

Dan Sanderson

Reputation: 2111

The basic structure for what you're doing is to have the cron job be responsible for dividing the work into smaller units, and executing each unit with the task queue. The payload for each task would be information that identifies the entities in the first set (such as a set of keys). Each task would perform whatever queries are necessary to join the entities in the first set with the entities in the second set, and store intermediate (or perhaps final) results. You can tweak the payload size and task queue rate until it performs the way you desire.
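As a rough sketch of the splitting step, the code below cuts the first set's keys into batches and hands each batch to an enqueue function as a JSON payload. The names `chunked`, `enqueue_match_tasks`, and the `enqueue` callable are hypothetical; on App Engine the callable would presumably wrap the task queue API (for example `taskqueue.add`), which is left out here so the example runs anywhere.

```python
import json


def chunked(keys, size):
    """Yield consecutive slices of `keys` with at most `size` items each."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


def enqueue_match_tasks(first_set_keys, batch_size, enqueue):
    """Split the first set into batches and pass each one to `enqueue`.

    `enqueue` is a hypothetical callable that accepts one string payload.
    Returns the number of tasks that were created.
    """
    count = 0
    for batch in chunked(first_set_keys, batch_size):
        # The payload only identifies the entities; the task does the matching.
        enqueue(json.dumps({"keys": batch}))
        count += 1
    return count
```

The `batch_size` argument is the payload size you would tune, together with the queue rate, until throughput is acceptable.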

If the results of each task need to be aggregated, you can have each task record its completion and test for whether all tasks are complete, or just have another job that polls the completion records, to fire off the aggregation. When the MapReduce feature is more widely available, that will be a framework for performing this kind of work.

http://www.youtube.com/watch?v=EIxelKcyCC0
http://code.google.com/p/appengine-mapreduce/
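The completion check mentioned above could look something like the following in-memory sketch. `CompletionTracker` is a hypothetical name, and a real App Engine version would presumably keep this state in the datastore and update it transactionally; the set is used so that a task that runs more than once is only counted once.

```python
class CompletionTracker(object):
    """Records finished task ids and reports when every task is done."""

    def __init__(self, total_tasks):
        self.total_tasks = total_tasks
        self.finished = set()

    def mark_done(self, task_id):
        """Record one completion; return True once all tasks have finished."""
        # Adding to a set makes repeated completions of the same task harmless.
        self.finished.add(task_id)
        return len(self.finished) == self.total_tasks
```

The task that sees `mark_done` return `True` would be the one to fire off the aggregation step; a polling job could instead compare the two counts periodically.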

Upvotes: 1

Peter Knego

Reputation: 80330

Multithreading your own code is not supported on GAE, so you cannot use it explicitly.

GAE itself can be multithreaded, which means that one frontend instance can handle multiple HTTP requests simultaneously.

In your case, the best way to achieve parallel task execution is the Task Queue.

Upvotes: 1

Sam Holder

Reputation: 32936

Maybe I'm misunderstanding something, but this sounds like the perfect match for a task queue, and I can't see how multithreading will help. As I understand it, multithreading only means that you can serve many responses simultaneously; it won't help if your responses take longer than the 30-second limit.

With a task queue, you can add a task, process until the time limit approaches, and then create another task with the remainder of the work if you haven't finished by the time limit.
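A minimal sketch of that "process until the limit, then re-enqueue the rest" pattern follows. The function names, the `enqueue` callable, and the 20-second default budget (chosen here only to leave some margin under the limit) are assumptions for illustration, not App Engine APIs.

```python
import time


def process_with_deadline(items, handle, budget_seconds, clock=time.time):
    """Handle items until the time budget is spent; return what is left."""
    deadline = clock() + budget_seconds
    for index, item in enumerate(items):
        if clock() >= deadline:
            # Out of time: hand back everything not yet processed.
            return items[index:]
        handle(item)
    return []


def run_task(items, handle, enqueue, budget_seconds=20, clock=time.time):
    """Process what fits in the budget and re-enqueue the remainder.

    `enqueue` is a hypothetical callable that schedules a follow-up task.
    Returns the number of items handled in this run.
    """
    remainder = process_with_deadline(items, handle, budget_seconds, clock)
    if remainder:
        enqueue(remainder)
    return len(items) - len(remainder)
```

Each follow-up task calls `run_task` again with the remainder, so the chain ends on its own once a run finishes with nothing left over.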

Upvotes: 1
