Reputation: 155
I have a web app that also does really intensive data processing. Some of the functions are extremely slow (think a couple of minutes).
Until now I had an architecture that spawned a new thread/process per connection so those slow functions wouldn't block other users. But this consumes way too much memory, and it goes against Tornado's architecture.
So I am wondering if there is a solution to this kind of issue. My code looks like this:
# code that uses too much memory because of the new threads being spawned
from threading import Thread
from time import sleep

def handler():
    thread = Thread(target=really_slow_function)
    thread.start()
    thread.join()
    return "done"

def really_slow_function():
    # this is an example of an intensive function
    # which should be treated as a black box
    sleep(100)
    return "done"
After refactoring I have the following code:
# code that doesn't scale because all the requests block on that one slow request.
from time import sleep
from tornado import gen

@gen.coroutine
def handler():
    yield really_slow_function()
    raise gen.Return("done")

def really_slow_function():
    # this is an example of an intensive function
    # which should be treated as a black box
    sleep(100)
    return "done"
The issue with this refactor is that the Tornado server blocks on really_slow_function and cannot serve other requests in the meantime.
So the question is: is there a way to refactor the handler WITHOUT touching really_slow_function and WITHOUT creating new threads/processes?
Upvotes: 1
Views: 836
Reputation: 22134
Use a ThreadPoolExecutor (from the concurrent.futures package) to run the long-running function in separate threads without starting a new thread each time.
from tornado.ioloop import IOLoop

async def handler():
    await IOLoop.current().run_in_executor(None, really_slow_function)
    return "done"
If you want to control exactly how many threads are eligible to run this function, you can make your own executor and pass it instead of None.
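For example, a bounded pool might look like the sketch below. It uses the stdlib asyncio loop in place of Tornado's IOLoop to stay self-contained (in Tornado 5+ the IOLoop is backed by asyncio, so IOLoop.current().run_in_executor(executor, fn) behaves the same way); the pool size of 4 and the shortened sleep are arbitrary choices for the demo:

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the black-box blocking function (sleep shortened for the demo).
def really_slow_function():
    time.sleep(0.1)
    return "done"

# Cap the pool at 4 worker threads (arbitrary choice): at most 4 slow calls
# run concurrently, and further calls wait in the executor's internal queue
# instead of spawning new threads.
executor = ThreadPoolExecutor(max_workers=4)

async def handler():
    loop = asyncio.get_running_loop()
    # Runs the blocking call on a pool thread; the event loop stays free
    # to serve other requests while we await the result.
    result = await loop.run_in_executor(executor, really_slow_function)
    return result

print(asyncio.run(handler()))  # -> done
```

The same executor object can be shared by all handlers, which is what keeps the total thread count bounded across requests.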
Upvotes: 1