Reputation: 20409
I need to perform thousands of URL fetch calls during the day. All the calls are the same; only two parameters change: "way" and "date".
Currently I use multiple cron entries to execute such calls:
- description: get data
url: /admin/getdata?d=way1,way2,way3,way4,...,way12
schedule: every day 8:30
- description: get data
url: /admin/getdata?d=way13,way14,way15,way16,...,way24
schedule: every day 8:40
...
- description: get data
url: /admin/getdata?d=way99,way100,way101,way102,...,way123
schedule: every day 9:20
Then in my getdata handler I parse the d parameter I received and perform multiple urlfetch calls:
from google.appengine.api import urlfetch

for date_ in dates:
    for way in d:  # d is the list of ways parsed from the request
        response = urlfetch.fetch('http://example.com?way=' + way + '&date=' + date_,
                                  deadline=60, headers=headers, follow_redirects=True)
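(For reference, this is roughly what the surrounding handler looks like, assuming webapp2 and a comma-separated d value; the class name is illustrative and the dates/headers plumbing is omitted:)

import webapp2
from google.appengine.api import urlfetch

class GetDataHandler(webapp2.RequestHandler):
    def get(self):
        # 'way1,way2,way3' -> ['way1', 'way2', 'way3']
        ways = self.request.get('d').split(',')
        for way in ways:
            urlfetch.fetch('http://example.com?way=' + way, deadline=60)

app = webapp2.WSGIApplication([('/admin/getdata', GetDataHandler)])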
But this doesn't help much: the 60 seconds allowed for a cron request are still not enough.
I was thinking about running the cron job every ten minutes, but then I would have to store the possible ways and dates somewhere, mark the requests that have already been executed, and reset those marks at the end of the day (to be able to execute everything again the next day).
Is there a better way to do this?
Upvotes: 0
Views: 316
Reputation: 11360
Alternatively, use one cron job that spawns a task queue task for each of the other URLs. That can be done in the default module, for free. I would set the countdown parameter to space the tasks out, so that you don't spin up too many instances. It simplifies app.yaml as well.
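A minimal sketch of that fan-out, assuming the existing /admin/getdata handler can process a single way per request (the 10-second spacing is an arbitrary choice):

from google.appengine.api import taskqueue

def spawn_fetch_tasks(all_ways):
    # One task per way; countdown staggers the start times so the
    # scheduler does not spin up many instances at once.
    for i, way in enumerate(all_ways):
        taskqueue.add(url='/admin/getdata',
                      params={'d': way},
                      method='GET',  # matches the existing GET handler
                      countdown=i * 10)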
Upvotes: 1
Reputation: 41089
A better way is to have just one cron job per day that fetches all the URLs. All you need to do is target this cron job at a backend instance, which does not have a request time limit.
Use Modules to create such an instance, and add a "target" setting to your cron job.
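A sketch of the two pieces involved, assuming a module named fetcher (the module name, instance class, and schedule are illustrative). First, the module configuration, e.g. fetcher.yaml, using basic scaling so its requests are not bound by the 60-second limit:

module: fetcher
runtime: python27
api_version: 1
threadsafe: true
instance_class: B4
basic_scaling:
  max_instances: 1
handlers:
- url: /admin/getdata
  script: main.app

Then the cron entry, routed to that module with "target":

- description: get data
  url: /admin/getdata?d=way1,way2,...,way123
  schedule: every day 8:30
  target: fetcher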
Upvotes: 1