Reputation: 837
System specification: macOS 10.13.6 - Python 3.7.0 - Tornado 5.1.1
I'd like to use a ThreadPoolExecutor to run blocking functions in a Tornado instance that serves a RESTful service.
The thread pool works as expected and spawns four worker threads in parallel (see code and console log below), as long as I do not try to yield the results returned by the executed function.
ThreadPoolExecutor without yielding results
import time

import tornado.gen
import tornado.web
from tornado.concurrent import run_on_executor
from concurrent.futures import ThreadPoolExecutor
from tornado.ioloop import IOLoop

MAX_WORKERS = 4
i = 0

class Handler(tornado.web.RequestHandler):
    executor = ThreadPoolExecutor(max_workers=MAX_WORKERS)

    @run_on_executor
    def background_task(self, i):
        print("going sleep %s" % (i))
        time.sleep(10)
        print("waking up from sleep %s" % (i))
        return str(i)

    @tornado.gen.coroutine
    def get(self):
        global i
        i += 1
        self.background_task(i)

def make_app():
    return tornado.web.Application([
        (r"/", Handler),
    ])

if __name__ == "__main__":
    app = make_app()
    app.listen(8000, '0.0.0.0')
    IOLoop.current().start()
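For comparison, the pooling behaviour itself can be reproduced outside Tornado. The following is a minimal sketch using only `concurrent.futures`; the 0.2-second sleep and the task count are arbitrary stand-ins for the 10-second `background_task` above:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def task(n):
    # stand-in for the blocking background task
    time.sleep(0.2)
    return str(n)

start = time.time()
with ThreadPoolExecutor(max_workers=4) as pool:
    # eight tasks on four workers: two "waves" of four in parallel
    results = list(pool.map(task, range(1, 9)))
elapsed = time.time() - start

print(results)        # ['1', '2', ..., '8']
print(elapsed < 0.8)  # well under the 1.6 s a serial run would need
```

As with the server above, a fifth task starts only once one of the four workers becomes free.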
Console output
going sleep 1
going sleep 2
going sleep 3
going sleep 4
waking up from sleep 1
going sleep 5
waking up from sleep 2
going sleep 6
waking up from sleep 3
going sleep 7
waking up from sleep 4
going sleep 8
waking up from sleep 5
going sleep 9
waking up from sleep 6
waking up from sleep 7
waking up from sleep 8
waking up from sleep 9
As can be seen, four workers run in parallel, and as soon as one finishes, the next queued task starts executing.
However, I run into issues as soon as I try to yield the result of the executed function from a coroutine. While this does not fully block the IOLoop, execution is delayed without a clear pattern.
Altered code: ThreadPoolExecutor now yielding result
MAX_WORKERS = 4
i = 0

class Handler(tornado.web.RequestHandler):
    executor = ThreadPoolExecutor(max_workers=MAX_WORKERS)

    @run_on_executor
    def background_task(self, i):
        print("%s: Going sleep %s" % (time.time(), i))
        time.sleep(10)
        print("%s: Waking up from sleep %s" % (time.time(), i))
        return str(i)

    @tornado.gen.coroutine
    def get(self):
        global i
        i += 1
        result = yield self.background_task(i)
        self.write(result)
Looking at the console output, only a single thread runs for the 1st and 2nd request, and the queued tasks are executed only once that single thread has finished (causing a delay of 10 seconds before task 2 starts, and another before task 3). Tasks 3, 4, 5 and 6, however, are executed in parallel, but with varying delays between the calls.
Console output
1548687401.331075: Going sleep 1
1548687411.333173: Waking up from sleep 1
1548687411.340162: Going sleep 2
1548687421.3419871: Waking up from sleep 2
1548687421.347039: Going sleep 3
1548687423.4030259: Going sleep 4
1548687423.884313: Going sleep 5
1548687424.6828501: Going sleep 6
1548687431.351986: Waking up from sleep 3
1548687431.3525162: Going sleep 7
1548687433.407232: Waking up from sleep 4
1548687433.407604: Going sleep 8
1548687433.8846452: Waking up from sleep 5
1548687433.885139: Going sleep 9
1548687434.685195: Waking up from sleep 6
1548687434.685662: Going sleep 10
1548687441.3577092: Waking up from sleep 7
1548687441.358009: Going sleep 11
1548687443.412503: Waking up from sleep 8
1548687443.888705: Waking up from sleep 9
1548687444.691127: Waking up from sleep 10
1548687451.359714: Waking up from sleep 11
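The yield-a-pooled-future pattern itself behaves as expected in isolation: awaiting the future suspends only the calling coroutine, not the pool. Here is a minimal sketch using asyncio's `run_in_executor` as a stand-in for Tornado's `run_on_executor` (the 0.2-second sleep is an arbitrary choice for illustration):

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=4)

def blocking_task(n):
    # stand-in for the blocking background task
    time.sleep(0.2)
    return str(n)

async def handle(n):
    # like `result = yield self.background_task(i)`: suspends this
    # coroutine until the pooled thread finishes, without blocking others
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(pool, blocking_task, n)

async def main():
    # four concurrent "requests"
    return await asyncio.gather(*(handle(n) for n in range(1, 5)))

start = time.time()
results = asyncio.run(main())
elapsed = time.time() - start
pool.shutdown()

print(results)        # ['1', '2', '3', '4']
print(elapsed < 0.6)  # all four run concurrently, ~0.2 s total
```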
Can anyone explain this behaviour? Do you have any fix?
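One way to narrow this down would be to rule out client-side connection reuse by issuing the requests from separate connections. Below is a self-contained sketch of such a threaded client; it uses a stand-in `http.server` backend with a 0.2-second sleep instead of the Tornado app, so the names and timings are illustrative assumptions:

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class SlowHandler(BaseHTTPRequestHandler):
    # stand-in for the Tornado handler; 0.2 s replaces the 10 s sleep
    def do_GET(self):
        time.sleep(0.2)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):  # silence per-request logging
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

results = []
def fetch():
    # each thread opens its own connection, so requests arrive in parallel
    with urllib.request.urlopen(url) as resp:
        results.append(resp.read())

start = time.time()
threads = [threading.Thread(target=fetch) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
server.shutdown()

print(results)        # four b'ok' responses
print(elapsed < 0.6)  # ~0.2 s in parallel, vs ~0.8 s serially
```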
Upvotes: 8
Views: 1320
Reputation: 1781
Tornado currently leaks memory when used with a ThreadPoolExecutor, so this approach is not usable.
Upvotes: 0
Reputation: 1
The process splits the task based on the CPU configuration and the workload. Concurrency can be verified with a simple test using Chrome's startup options.
Upvotes: -1