Can I asynchronously duplicate a webapp2.RequestHandler Request to a different url?

Question

For a percentage of production traffic, I want to duplicate the received request to a different version of my application. This needs to happen asynchronously so I don't double service time to the client.

The reason for doing this is so I can compare the responses generated by the prod version and a production candidate version. If their results are appropriately similar, I can be confident that the new version hasn't broken anything. (If I've made a functional change to the application, I'd filter out the necessary part of the response from this comparison.)

So I'm looking for an equivalent to:

class Foo(webapp2.RequestHandler):
  def post(self):
    handle = make_async_call_to('http://other_service_endpoint.com/', self.request)

    # process the user's request in the usual way

    test_response = handle.get_response()

    # compare the locally-prepared response and the remote one, and log
    # the diffs

    # return the locally-prepared response to the caller

UPDATE: google.appengine.api.urlfetch was suggested as a potential solution to my problem. It behaves the way I wanted in production, but in the dev_appserver it is effectively synchronous: the request doesn't go out until get_result() is called, and then it blocks:

    start_time = time.time()
    rpcs = []

    print 'creating rpcs:'
    for _ in xrange(3):
        rpcs.append(urlfetch.create_rpc())
        print time.time() - start_time

    print 'making fetch calls:'
    for rpc in rpcs:
        urlfetch.make_fetch_call(rpc, 'http://httpbin.org/delay/3')
        print time.time() - start_time

    print 'getting results:'
    for rpc in rpcs:
        rpc.get_result()
        print time.time() - start_time


creating rpcs:
9.51290130615e-05
0.000154972076416
0.000189065933228
making fetch calls:
0.00029993057251
0.000356912612915
0.000473976135254
getting results:
3.15417003632
6.31326603889
9.46627306938

UPDATE2

So, after playing with some other options, I found a way to make completely non-blocking requests:

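import logging
import time

from google.appengine.api import apiproxy_stub_map, urlfetch

# create_callback(rpc, url) is a helper not shown here; it returns the
# callable to run when the given fetch completes.
start_time = time.time()
rpcs = []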

logging.info('creating rpcs:')
for i in xrange(10):
    rpc = urlfetch.create_rpc(deadline=30.0)
    url = 'http://httpbin.org/delay/{}'.format(i)
    urlfetch.make_fetch_call(rpc, url)
    rpc.callback = create_callback(rpc, url)
    rpcs.append(rpc)
    logging.info(time.time() - start_time)

logging.info('getting results:')
while rpcs:
    rpc = apiproxy_stub_map.UserRPC.wait_any(rpcs)
    rpcs.remove(rpc)
    logging.info(time.time() - start_time)

...but the important point to note is that none of the async fetch options in urlfetch work in the dev_appserver. Having discovered this, I went back to try @DanCornilescu's solution and found that it works properly in production, just not in the dev_appserver.

Dan Cornilescu · Accepted Answer

The URL Fetch service supports asynchronous requests. From "Issuing an asynchronous request":

HTTP(S) requests are synchronous by default. To issue an asynchronous request, your application must:

1. Create a new RPC object using urlfetch.create_rpc(). This object represents your asynchronous call in subsequent method calls.

2. Call urlfetch.make_fetch_call() to make the request. This method takes your RPC object and the request target's URL as parameters.

3. Call the RPC object's get_result() method. This method returns the result object if the request is successful, and raises an exception if an error occurred during the request.

The following snippets demonstrate how to make a basic asynchronous request from a Python application. First, import the urlfetch library from the App Engine SDK:

from google.appengine.api import urlfetch

Next, use urlfetch to make the asynchronous request:

rpc = urlfetch.create_rpc()
urlfetch.make_fetch_call(rpc, "http://www.google.com/")

# ... do other things ...
try:
    result = rpc.get_result()
    if result.status_code == 200:
        text = result.content
        self.response.write(text)
    else:
        self.response.status_code = result.status_code
        logging.error("Error making RPC request")
except urlfetch.DownloadError:
    logging.error("Error fetching URL")

Note: As per Sniggerfardimungus's experiment mentioned in the question's update, the async calls might not work as expected on the development server (they are serialized instead of running concurrently), but they do when deployed on GAE. Personally, I haven't used the async calls yet, so I can't really say.
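
Applied to the handler in the question, the three steps above might look roughly like the sketch below. This is only a sketch under assumptions: start_shadow_fetch and finish_shadow_fetch are hypothetical helper names, and the urlfetch module is passed in as an argument instead of being imported, so that the helpers can be exercised outside App Engine with a stand-in object.

```python
import logging


def start_shadow_fetch(urlfetch, url, method, body, headers, deadline=10.0):
    """Kick off an async copy of the request and return the RPC handle.

    `urlfetch` is expected to be google.appengine.api.urlfetch (or a test
    double exposing create_rpc and make_fetch_call).
    """
    rpc = urlfetch.create_rpc(deadline=deadline)
    urlfetch.make_fetch_call(rpc, url, payload=body, method=method,
                             headers=headers)
    return rpc


def finish_shadow_fetch(rpc):
    """Wait for the shadow response without letting its failure reach the client.

    Returns (status_code, content), or None if the fetch failed.
    """
    try:
        result = rpc.get_result()
    except Exception:  # broad on purpose: shadow traffic must not break prod
        logging.exception("Shadow request failed")
        return None
    return result.status_code, result.content
```

In the handler, start_shadow_fetch would be called with self.request.method, self.request.body and a (suitably filtered) copy of self.request.headers before the usual processing, and finish_shadow_fetch after the local response has been prepared.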

If the intent is to not block at all while waiting for the response from the production candidate app, you could push a copy of the original request and the production-prepared response onto a task queue, then answer the original request with negligible delay (that of enqueueing the task).
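
A minimal sketch of what the enqueued copy could contain, assuming the request and the production response are serialized to JSON. build_shadow_payload, parse_shadow_payload and the field names are hypothetical; the resulting string would be handed to the task queue (for example as the payload argument of taskqueue.add), which is not shown here.

```python
import base64
import json


def build_shadow_payload(method, path, headers, body, prod_status, prod_body):
    """Serialize the original request and the production response to JSON.

    Bodies are base64-encoded so arbitrary bytes survive the round trip.
    """
    def encode(data):
        return base64.b64encode(data).decode("ascii")

    return json.dumps({
        "request": {"method": method, "path": path,
                    "headers": dict(headers), "body": encode(body)},
        "prod_response": {"status": prod_status, "body": encode(prod_body)},
    })


def parse_shadow_payload(payload):
    """Inverse of build_shadow_payload, for use in the task handler."""
    data = json.loads(payload)
    data["request"]["body"] = base64.b64decode(data["request"]["body"])
    data["prod_response"]["body"] = base64.b64decode(
        data["prod_response"]["body"])
    return data
```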

The handler for the respective task queue would, outside of the original request's critical path, make the request to the staging app using the copy of the original request (async or not; it doesn't really matter for the production app's response time), get its response, compare it with the production-prepared response, log the deltas, etc. This can be nicely wrapped in a separate module for minimal changes to the production app and deployed/deleted as needed.
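
The comparison step in that task handler could be sketched as follows, assuming text responses that can be compared line by line. diff_responses and its ignore predicate (for dropping lines that are expected to differ, such as timestamps) are hypothetical names, not part of any App Engine API.

```python
import difflib


def diff_responses(prod_text, candidate_text, ignore=None):
    """Return unified-diff lines between two response bodies.

    `ignore` is an optional predicate; lines for which it returns True
    are dropped from both sides before comparing.
    """
    def prepare(text):
        lines = text.splitlines()
        if ignore is not None:
            lines = [line for line in lines if not ignore(line)]
        return lines

    return list(difflib.unified_diff(prepare(prod_text),
                                     prepare(candidate_text),
                                     fromfile="prod", tofile="candidate",
                                     lineterm=""))
```

An empty list means the two bodies matched after filtering; otherwise the returned lines can be joined with newlines and logged.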
