user423455

Reputation: 843

How to spawn a different function after the first one finishes using gevent?

The basic idea is as follows:

A request comes to views1, which first returns the username. A separate heavy job, do_something_else, should run right after views1 is done. You can think of this as creating a new user, where some heavy checking has to happen in the background.

def views1(..):
   username = get_uername(...)
   return username

from lib import do_something_else
def do_something_else(...):
   # do heavy stuff here

gevent.joinall([
   gevent.spawn(views1, parameter1, parameter2, ...),
   gevent.spawn(do_something_else, parameter1, parameter2, ...)
])

The problem is that, based on my logging, I don't think do_something_else was ever called. I read the tutorial, but I don't know where to place gevent.sleep(0). I don't want blocking. I want the user to see the username right away, while do_something_else runs in the background.

Any ideas?

Upvotes: 3

Views: 2066

Answers (1)

Alex

Reputation: 2975

It is important to understand that you need to move 'heavy load' processing into a thread pool [1].

All processing that takes place in a gevent thread (you can have one gevent hub per native thread) should focus only on handling network requests and sending responses.
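To see why this matters, here is a minimal sketch, assuming Python 3 and no monkey patching. The names run_demo, slow_handler, and quick_handler are hypothetical and exist only for this illustration. A greenlet that blocks with time.sleep holds up every other greenlet on the same hub, while one that yields with gevent.sleep lets the others run first.

```python
import time

import gevent


def run_demo(sleep):
    """Run a slow and a quick greenlet; `sleep` is the call the slow one uses."""
    log = []

    def slow_handler():
        # time.sleep blocks the whole native thread (and so the hub);
        # gevent.sleep yields to other greenlets while waiting.
        sleep(0.05)
        log.append("slow done")

    def quick_handler():
        log.append("quick done")

    gevent.joinall([gevent.spawn(slow_handler), gevent.spawn(quick_handler)])
    return log


if __name__ == "__main__":
    print(run_demo(time.sleep))    # ['slow done', 'quick done']
    print(run_demo(gevent.sleep))  # ['quick done', 'slow done']
```

In the blocking case the quick handler cannot even start until the slow one has finished, which is the situation the thread pool is meant to avoid.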

from gevent import spawn, run
from gevent.threadpool import ThreadPool
from time import sleep as heavy_load, time as now

class Globals:
    jobs = 4
    index = 0
    greenlets = []
    pool = ThreadPool(3) # change size of the pool appropriately

start = now()

def get_uername():
    heavy_load(0.1)
    Globals.index += 1
    return "Alex {0}".format(Globals.index)

def do_something_else(username):
    heavy_load(2.0)
    print "Heavy job done for", username, now() - start

def views1():
    "a request comes to views1 and it first returns the username"
    username = get_uername()
    ## There is some heavy job separate done by do_something_else right after views1 is done
    Globals.greenlets.append( 
        Globals.pool.spawn(do_something_else, username) 
        )
    # return username
    print "Returned requested username", username, now() - start


if __name__ == '__main__':
    ## simulate clients 
    for job_index in xrange(Globals.jobs):
        Globals.greenlets.append( spawn(views1) )

    ## wait for all tasks to complete
    # for greenlet in Globals.greenlets:
        # try:
            # greenlet.join()
        # except AttributeError, e:
            # greenlet.get()
    run()
    print "Test done", now() - start

This is the output of the test:

python threadpool_test.py
Returned requested username Alex 1 0.101000070572
Returned requested username Alex 2 0.201999902725
Returned requested username Alex 3 0.302999973297
Returned requested username Alex 4 0.40299987793
Heavy job done for Alex 1 2.10100007057
Heavy job done for Alex 2 2.2009999752
Heavy job done for Alex 3 2.3029999733
Heavy job done for Alex 4 4.10299992561
Test done 4.10500001907

Notice that all requests complete first, while the do_something_else tasks run in parallel in batches of 3 (the pool size).
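The script above uses Python 2 syntax, and its top-level run import may not exist in current gevent releases. The following is a rough Python 3 sketch of the same pattern, not a drop-in replacement. The names main, background, and events are hypothetical, the sleep stands in for real heavy work, and the sketch waits on the pool results explicitly instead of calling run(). It uses three simulated requests for a pool of three, because as far as I know ThreadPool.spawn in recent gevent versions waits for a free slot when the pool is full.

```python
import time

import gevent
from gevent.threadpool import ThreadPool

pool = ThreadPool(3)  # pool size taken from the example above; tune as needed


def do_something_else(username, events):
    time.sleep(0.2)  # stand-in for heavy blocking work, runs in a native thread
    events.append(("heavy", username))


def views1(username, background, events):
    # Hand the heavy job to the thread pool, then return immediately.
    background.append(pool.spawn(do_something_else, username, events))
    events.append(("returned", username))
    return username


def main():
    background = []  # handles of the offloaded jobs
    events = []      # (kind, username) tuples in the order they happened
    requests = [
        gevent.spawn(views1, "user%d" % i, background, events)
        for i in range(1, 4)
    ]
    gevent.joinall(requests)  # every simulated request has returned here
    for job in background:
        job.get()             # wait for the heavy jobs, re-raising any error
    return [r.value for r in requests], events


if __name__ == "__main__":
    names, events = main()
    print(names)
    print(events)
```

All three "returned" events are recorded before any "heavy" event, which mirrors the ordering in the output above.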

Without ThreadPool, every request would take the additional time introduced by do_something_else, which defeats the asynchronous programming that gevent offers. In that case the output would look like this:

Heavy job done for Alex 1 2.10100007057
Returned requested username Alex 1 2.10100007057
Heavy job done for Alex 2 4.2009999752
Returned requested username Alex 2 4.20199990273
Heavy job done for Alex 3 6.30200004578
Returned requested username Alex 3 6.3029999733
Heavy job done for Alex 4 8.40299987793
Returned requested username Alex 4 8.40400004387
Test done 8.40400004387

Notice that the 4th request completed after 8.4 seconds instead of the 0.4 seconds it took when handled asynchronously.
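The timings in both outputs follow from simple arithmetic, which the small model below reproduces without gevent. It assumes what the test script simulates: a 0.1 second username lookup, a 2.0 second heavy job, 4 requests, and 3 workers. The function names are hypothetical, and the pooled model assumes that submitting a job never delays the request itself, as in the first output.

```python
import heapq


def pooled_finish_times(jobs=4, lookup=0.1, heavy=2.0, workers=3):
    """Finish time of each heavy job when it is handed to a worker pool."""
    free_at = [0.0] * workers  # time at which each worker becomes free
    heapq.heapify(free_at)
    finished = []
    for k in range(1, jobs + 1):
        submitted = k * lookup  # lookups run one after another on the hub
        start = max(submitted, heapq.heappop(free_at))
        end = start + heavy
        heapq.heappush(free_at, end)
        finished.append(round(end, 3))
    return finished


def serial_finish_times(jobs=4, lookup=0.1, heavy=2.0):
    """Finish time of each request when the heavy job runs inline."""
    return [round(k * (lookup + heavy), 3) for k in range(1, jobs + 1)]


if __name__ == "__main__":
    print(pooled_finish_times())  # [2.1, 2.2, 2.3, 4.1]
    print(serial_finish_times())  # [2.1, 4.2, 6.3, 8.4]
```

The 4th pooled job finishes at 4.1 seconds because it has to wait for the first worker to become free at 2.1 seconds.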

[1] http://code.google.com/p/gevent/source/browse/examples/threadpool.py

Upvotes: 3
