Reputation: 1819
I am developing a Python app on Google App Engine. I have a cron job which imports a list of 20 fresh files every day from an S3 bucket to a GS bucket.
Here is my code:
import webapp2
import yaml

from google.appengine.ext import deferred


class CronTask(webapp2.RequestHandler):
    def get(self):
        # my_import_function is imported from another package
        with open('/my/config/file') as config_file:
            config_dict = yaml.load(config_file)
        for file_to_load in config_dict:
            deferred.defer(my_import_function, file_to_load)

app = webapp2.WSGIApplication([
    ('/', CronTask)
], debug=True)
Note that my_import_function
is part of another package and takes some time to complete.
My question: is it a good idea to use deferred.defer
for this task, or should I proceed differently to launch my_import_function
for all my arguments?
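For completeness, the handler is triggered by an entry in my cron.yaml along these lines (the exact schedule and URL here are just an illustration, not my real config):

```yaml
cron:
- description: daily S3 to GS import
  url: /
  schedule: every 24 hours
```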
Upvotes: 0
Views: 162
Reputation: 16563
You should use the task queue, but depending on how many tasks you have, you may not want to use deferred.defer().
With deferred.defer()
you can only enqueue one task per call. If you are enqueueing a lot of tasks, that is really inefficient. This is really slow:
for x in some_list:
    deferred.defer(my_task, x)
With a lot of tasks, it is much more efficient to do something like this:
from google.appengine.api import taskqueue

task_list = []
for x in some_list:
    task_list.append(taskqueue.Task(url="/task-url", params=dict(x=x)))
taskqueue.Queue().add(task_list)
About a year ago, I did a timing comparison, and the latter was at least an order of magnitude faster than the former.
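One caveat: if I remember correctly, a single Queue.add() call only accepts up to 100 tasks, so for a bigger list you would add the tasks in chunks. A minimal sketch of a chunking helper (plain Python, independent of the taskqueue API):

```python
def chunked(items, size=100):
    """Yield successive slices of at most `size` items from a list."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Example: 250 pending tasks split into batches of at most 100.
batches = list(chunked(list(range(250))))
print([len(b) for b in batches])  # -> [100, 100, 50]
```

You would then call taskqueue.Queue().add(batch) once per batch instead of once for the whole list.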
Upvotes: 2