Reputation: 241
My problem is how best to release the memory used by the responses of asynchronous URL fetches on App Engine. Here is basically what I do in Python:
rpcs = []
for event in event_list:
    url = 'http://someurl.com'
    rpc = urlfetch.create_rpc()
    rpc.callback = create_callback(rpc)
    urlfetch.make_fetch_call(rpc, url)
    rpcs.append(rpc)
for rpc in rpcs:
    rpc.wait()
In my test scenario it does this for 1500 requests, but I need an architecture that can handle many more within a short amount of time.
Then there is a callback function, which adds a task to a queue to process the results:
def event_callback(rpc):
    result = rpc.get_result()
    data = json.loads(result.content)
    taskqueue.add(queue_name='name', url='url', params={'data': data})
My problem is that I make so many concurrent RPC calls that my instance runs out of memory and is shut down: "Exceeded soft private memory limit with 159.234 MB after servicing 975 requests total"
I already tried three things:
del result
del data
and
result = None
data = None
and I ran the garbage collector manually after the callback function.
gc.collect()
But nothing seems to release the memory directly after a callback function has added its task to the queue, and therefore the instance crashes. Is there any other way to do it?
Upvotes: 3
Views: 771
Reputation: 2459
Use the task queue for the urlfetch calls as well: fan out to avoid exhausting memory, register named tasks, and pass the event_list cursor to the next task. In such a scenario you might want to fetch and process in the same task instead of registering a new task for every processing step, especially if the processing also includes datastore writes.
I also find ndb to make these async solutions more elegant.
Check out Brett Slatkin's talk on scalable apps, and perhaps pipelines.
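A minimal sketch of the cursor hand-off described above, written without the App Engine SDK so it can run anywhere. The names `process_batch`, `fetch_and_process`, `enqueue_next` and the `fetch-events-` task-name prefix are hypothetical; on App Engine, `enqueue_next` would presumably wrap a `taskqueue.add(...)` call that passes the task name and the cursor as a parameter.

```python
def process_batch(event_list, cursor, batch_size, fetch_and_process, enqueue_next):
    """Handle one slice of event_list, then hand the remainder to a new task.

    fetch_and_process and enqueue_next are injected callables (assumptions):
    the first fetches one URL and processes its response, the second
    registers the follow-up task.
    """
    batch = event_list[cursor:cursor + batch_size]
    for event in batch:
        # Fetch and process inside the same task, so responses from
        # earlier events can be released before the next one arrives.
        fetch_and_process(event)

    next_cursor = cursor + len(batch)
    if next_cursor < len(event_list):
        # A name derived from the cursor means a retried task tries to
        # register the same follow-up name instead of a fresh one.
        task_name = 'fetch-events-%d' % next_cursor
        enqueue_next(task_name, next_cursor)
    return next_cursor
```

Each task then only ever holds the responses of its own batch, and the batch size becomes the knob for trading memory per instance against the number of tasks.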
Upvotes: 1
Reputation: 469
Wrong approach. Put these URLs into a push queue, increase its rate to the desired value (default: 5/sec), and let each task handle one URL fetch (or a group of them). Please note that there's a safety limit of 3000 URL Fetch API calls per minute (and one URL fetch might use more than one API call).
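As a rough sketch of the "one task per URL or group of URLs" idea, the snippet below splits a URL list into groups and registers one task per group. `chunk`, `enqueue_fetch_tasks` and `add_task` are hypothetical names; `add_task` stands in for something like `taskqueue.add` bound to a queue name and handler URL, which is an assumption and not shown here.

```python
import json


def chunk(items, size):
    """Yield consecutive slices of items, each with at most size elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def enqueue_fetch_tasks(urls, group_size, add_task):
    """Register one task per group of URLs and return the number of tasks.

    add_task is an injected callable (assumption) that accepts a params
    keyword argument, mirroring how the question calls taskqueue.add.
    """
    count = 0
    for group in chunk(urls, group_size):
        # Task params are flat key/value pairs, so the group is
        # serialized to a JSON string.
        add_task(params={'urls': json.dumps(group)})
        count += 1
    return count
```

With a group size of 1 this is one fetch per task; larger groups reduce the number of tasks, while the queue's rate setting controls how quickly the fetches actually run.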
Upvotes: 2