TTT

Reputation: 4434

Requests time out between App Engine and EC2

My webapp has two parts:

  1. a GAE server which handles web requests and sends them to an EC2 REST server
  2. an EC2 REST server which does all the calculations given information from GAE and sends back results

It works fine when the calculations are simple. Otherwise, I get a timeout error on the GAE side.

I know there are several approaches to this timeout issue. But after some research, I found (please correct me if I am wrong):

  1. taskqueue would not fit my needs, since some of the calculations could take more than half an hour.
  2. A 'GAE backend instance' works only if I keep another instance reserved all the time. But since I have already reserved an EC2 instance, I would like to find a "cheap" solution (not paying for a GAE backend instance and EC2 at the same time).
  3. 'GAE Asynchronous Requests' are also not an option, since the request still waits for the response from EC2, although users can send other requests while they are waiting.

Below is a simplified version of my code. It:

  1. asks users to upload a csv
  2. parses this csv and sends the information to EC2
  3. generates the output page from the EC2 response

OutputPage.py

    import cgi

    from google.appengine.ext import webapp

    from przm import przm_batchmodel
    # przm_batchoutput_backend is used below; its import is not shown here.


    class OutputPage(webapp.RequestHandler):
        def post(self):
            form = cgi.FieldStorage()
            thefile = form['upfile']

            # This is where the uploaded file is processed and sent to EC2 for computing.
            html = przm_batchmodel.loop_html(thefile)
            przm_batchoutput_backend.przmBatchOutputPageBackend(thefile)
            self.response.out.write(html)


    app = webapp.WSGIApplication([('/.*', OutputPage)], debug=True)

przm_batchmodel.py (this is the code that sends the information to EC2)

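    import csv
    import json
    from google.appengine.api import urlfetch

    def loop_html(thefile):
        # Parse the uploaded csv and send it to the REST server; the returned value is an HTML page.
        rows = list(csv.reader(thefile.file.read().splitlines()))
        # urlfetch needs a string payload, not a csv.reader; JSON is an assumed wire format.
        response = urlfetch.fetch(url=REST_server, payload=json.dumps(rows), method=urlfetch.POST, headers=http_headers, deadline=60)
        return response.content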

At this moment, my questions are:

  1. Is there a way on the GAE side to just send the request to EC2 without waiting for its response? If this is possible, on the EC2 side I can email users to notify them when the results are ready.
  2. If question 1 is not possible, is there a way to create a monitor on EC2 which will start the calculation once the information is received from the GAE side?

I appreciate any suggestions.

Upvotes: 0

Views: 252

Answers (1)

Romin

Reputation: 8816

Here are some points:

  • For Question 1: You do not need to wait on the GAE side for EC2 to complete its work. You are already using URLFetch to send the data to EC2. As long as the data can be sent to the EC2 side within 60 seconds and its size is not more than 10MB, you are fine.

  • You will need a Receipt Handler on the EC2 side that collects this data and sends back an Ack. An Ack is sufficient for the GAE side to track the activity. You can then write some code on the EC2 side to report back to the GAE side that the calculation is done or, as you mentioned, send an email if needed.

  • I suggest that you create your own little tracker on the GAE side. For example, when the file is uploaded, create a Task and send back the Ack immediately to the client. Then you can use a Cron Job or Task Queue on the App Engine side to simply send the work off to EC2. Do not wait for EC2 to complete its job. Then let EC2 report back to GAE that its work is done for a particular Task Id and send off an email (if required) to notify the users that the work is done. In fact, EC2 can even report back with a batch of Task Ids that it completed, instead of sending a notification for each Task Id.
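
To make the flow above a bit more concrete, here is a minimal sketch of the idea. It uses only the plain Python 3 standard library, so it is not App Engine or EC2 code: an in-memory dict stands in for the datastore, and a direct method call stands in for the URLFetch request and the HTTP callback. The names `JobTracker`, `Ec2Worker` and `dispatch` are made up for illustration.

```python
import queue
import threading
import uuid


class JobTracker(object):
    """Hypothetical stand-in for the GAE-side tracker (a datastore model in a real app)."""

    def __init__(self):
        self._status = {}
        self._lock = threading.Lock()

    def create(self):
        """Register a new task and return its id; the client can be acked right away."""
        task_id = uuid.uuid4().hex
        with self._lock:
            self._status[task_id] = "PENDING"
        return task_id

    def mark_sent(self, task_id):
        """Record that EC2 acknowledged receipt; never downgrade a finished task."""
        with self._lock:
            if self._status.get(task_id) == "PENDING":
                self._status[task_id] = "SENT"

    def mark_done(self, task_ids):
        """Callback for EC2: accepts a batch of finished task ids."""
        with self._lock:
            for task_id in task_ids:
                if task_id in self._status:
                    self._status[task_id] = "DONE"

    def status(self, task_id):
        with self._lock:
            return self._status.get(task_id)


class Ec2Worker(object):
    """Hypothetical stand-in for the EC2 REST server."""

    def __init__(self, calculate, report_done):
        self._jobs = queue.Queue()
        self._calculate = calculate
        self._report_done = report_done
        worker = threading.Thread(target=self._run)
        worker.daemon = True
        worker.start()

    def receive(self, task_id, rows):
        """Receipt handler: queue the job and return an Ack without computing."""
        self._jobs.put((task_id, rows))
        return {"ack": True, "task_id": task_id}

    def _run(self):
        while True:
            task_id, rows = self._jobs.get()
            try:
                self._calculate(rows)
                # In a real setup this would be an HTTP call back to GAE, or an email.
                self._report_done([task_id])
            except Exception:
                pass  # a real worker would record the failure instead of ignoring it
            finally:
                self._jobs.task_done()

    def wait_idle(self):
        """Block until every queued job has been processed (for the demo only)."""
        self._jobs.join()


def dispatch(tracker, worker, rows):
    """GAE side: create a task, hand it to EC2, and return as soon as the Ack arrives."""
    task_id = tracker.create()
    ack = worker.receive(task_id, rows)
    if ack.get("ack"):
        tracker.mark_sent(task_id)
    return task_id


if __name__ == "__main__":
    tracker = JobTracker()
    worker = Ec2Worker(calculate=lambda rows: sum(len(r) for r in rows),
                       report_done=tracker.mark_done)
    task_id = dispatch(tracker, worker, [["a", "b"], ["c", "d"]])
    worker.wait_idle()
    print(task_id, tracker.status(task_id))
```

The point is that `dispatch` returns as soon as the Ack comes back, so the GAE request never waits on the long calculation; the status only moves to done when the worker calls back later.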

Upvotes: 3
