JLTChiu

Reputation: 1023

Make a non-blocking request with requests when running Flask with Gunicorn and Gevent

My Flask application will receive a request, do some processing, and then make a request to a slow external endpoint that takes 5 seconds to respond. It looks like running Gunicorn with Gevent will allow it to handle many of these slow requests at the same time. How can I modify the example below so that the view is non-blocking?

from flask import Flask
import requests

app = Flask(__name__)

@app.route('/do', methods=['POST'])
def do():
    # Waits ~5 seconds for the slow external endpoint to respond
    result = requests.get('slow api')
    return result.content

The server is started with:

gunicorn server:app -k gevent -w 4

Upvotes: 24

Views: 23211

Answers (3)

jerry

Reputation: 539

You can use grequests. It allows other greenlets to run while the request is made. It is compatible with the requests library and returns a requests.Response object. The usage is as follows:

import grequests

@app.route('/do', methods=['POST'])
def do():
    # grequests.map sends the requests on greenlets and returns a list of
    # requests.Response objects, so other greenlets can run while waiting
    result = grequests.map([grequests.get('slow api')])
    return result[0].content

Edit: I ran a test and saw that the response time didn't improve with grequests, since gunicorn's gevent worker already performs monkey-patching when it is initialized: https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/ggevent.py#L65
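
To illustrate why, here is a minimal standalone sketch (not part of the original test; the httpbin.org delay endpoint is just an illustrative stand-in for the slow API): once gevent's monkey-patching is applied, plain requests calls already yield to other greenlets while they wait on the socket.

from gevent import monkey
monkey.patch_all()  # gunicorn's gevent worker does this at startup

import time
import gevent
import requests

def fetch(i):
    requests.get('https://httpbin.org/delay/2')  # ~2 second blocking call
    print('request %d done' % i)

start = time.time()
gevent.joinall([gevent.spawn(fetch, i) for i in range(5)])
# With cooperative scheduling this finishes in roughly 2 seconds, not 10
print('elapsed: %.1fs' % (time.time() - start))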

Upvotes: 1

sytech

Reputation: 40861

If you're deploying your Flask application with gunicorn, it is already non-blocking. If a client is waiting on a response from one of your views, another client can make a request to the same view without a problem. There will be multiple workers to process multiple requests concurrently. No need to change your code for this to work. This also goes for pretty much every Flask deployment option.
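
If you want to verify this yourself, a quick client-side sketch (assuming the app from the question is served at http://localhost:8000; adjust the URL and request count for your setup) is to fire several requests at the view concurrently and time them:

import time
from concurrent.futures import ThreadPoolExecutor

import requests

def hit(_):
    # POST to the /do view from the question
    return requests.post('http://localhost:8000/do').status_code

start = time.time()
with ThreadPoolExecutor(max_workers=8) as pool:
    codes = list(pool.map(hit, range(8)))

# If requests are handled concurrently, the total time stays close to the
# 5 second upstream latency instead of 8 x 5 seconds
print(codes, '%.1fs' % (time.time() - start))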

Upvotes: 19

e4c5

Reputation: 53734

First, a bit of background: a blocking socket is the default kind of socket; once you start reading, your app or thread does not regain control until data is actually read or you are disconnected. This is how python-requests operates by default. There is a spin-off called grequests which provides non-blocking reads.

The major mechanical difference is that send, recv, connect and accept can return without having done anything. You have (of course) a number of choices. You can check return code and error codes and generally drive yourself crazy. If you don’t believe me, try it sometime.

Source: https://docs.python.org/2/howto/sockets.html

It also goes on to say:

There’s no question that the fastest sockets code uses non-blocking sockets and select to multiplex them. You can put together something that will saturate a LAN connection without putting any strain on the CPU. The trouble is that an app written this way can’t do much of anything else - it needs to be ready to shuffle bytes around at all times.

Assuming that your app is actually supposed to do something more than that, threading is the optimal solution.

But do you want to add a whole lot of complexity to your view by having it spawn its own threads? Particularly when gunicorn already has async workers?

The asynchronous workers available are based on Greenlets (via Eventlet and Gevent). Greenlets are an implementation of cooperative multi-threading for Python. In general, an application should be able to make use of these worker classes with no changes.

and

Some examples of behavior requiring asynchronous workers: Applications making long blocking calls (Ie, external web services)
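
For reference, the gevent setup from the question can also be written as a gunicorn config file; this is only a sketch, and the numbers are illustrative rather than recommendations:

# gunicorn.conf.py -- equivalent to `gunicorn server:app -k gevent -w 4`
wsgi_app = "server:app"
worker_class = "gevent"        # cooperative greenlet-based workers
workers = 4
worker_connections = 1000      # max simultaneous greenlets per worker (gunicorn's default)

Start it with gunicorn -c gunicorn.conf.py.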

So to cut a long story short, don't change anything! Just let it be. If you are making any changes at all, let it be to introduce caching. Consider using CacheControl, a caching library recommended by the python-requests developers.
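
A sketch of that caching idea (assuming the upstream API actually sends cache headers; CacheControl is installed with pip install CacheControl):

from flask import Flask
import requests
from cachecontrol import CacheControl

app = Flask(__name__)
session = CacheControl(requests.Session())  # wraps the session with HTTP caching

@app.route('/do', methods=['POST'])
def do():
    # Cacheable responses are served from the local cache on repeat calls,
    # skipping the 5 second wait on the slow API entirely
    result = session.get('slow api')  # 'slow api' placeholder from the question
    return result.content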

Upvotes: 8
