
Reputation: 76107

Tornado AsyncHTTPClient requests timing out under medium load

I'm running a simple web app that calls a few web services for each request, and I've found that the requests our server makes sometimes time out (the synthetic 599 error) even though the other service is responsive at all times (I've verified that).

These are the kinds of error messages I get:

HTTP 599: Connection timed out after 7005 milliseconds

(timed out while connecting)

HTTP 599: Operation timed out after 5049 milliseconds with 0 out of -1 bytes received

(timed out before receiving data)

HTTP 599: Operation timed out after 10005 milliseconds with 11197 out of 13047 bytes received

(timed out with data partially transferred)

I've been able to reproduce this in two different environments: an Amazon EC2 mini instance and my MacBook Pro (i7). On the EC2 instance the timeouts start with as few as 2 concurrent clients making requests; the MacBook holds up until 8 concurrent clients, then starts showing timeouts as well.

I've tried a few things, such as updating Tornado (2.2, 2.3.1, 2.4.1 and 3.1.1, if I remember correctly), switching the underlying AsyncHTTPClient implementation from the default simple one to the pycurl-based one, and increasing the number of async clients (to 200), but the error still happens.

I'm not sure what I could possibly be doing wrong, because this does not look at all like the scalability that Tornado promises...

Any hints?

Update

Just for the record: we were calling memcache from an async callback, but the memcache library itself wasn't async, so each call blocked the event loop. I replaced it with: https://github.com/dpnova/tornado-memcache/

I think that was the biggest issue, although we still get a 599 from time to time.
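For anyone who cannot swap in an async client, a different workaround (not the one used above) is to push the synchronous call onto a thread pool so the event loop stays free. Here is a minimal sketch using plain asyncio, which Tornado 5 and later run on. `blocking_cache_get`, `heartbeat` and the key `"user:1"` are hypothetical names made up for illustration, and the `time.sleep` only simulates a slow synchronous memcache call.

```python
import asyncio
import time


def blocking_cache_get(key):
    # Hypothetical stand-in for a synchronous memcache client call.
    time.sleep(0.2)
    return "value-for-" + key


async def heartbeat(ticks, interval=0.02):
    # Wake up every `interval` seconds and record the gap between wakeups.
    # Large gaps mean something else was hogging the event loop.
    gaps = []
    last = time.monotonic()
    for _ in range(ticks):
        await asyncio.sleep(interval)
        now = time.monotonic()
        gaps.append(now - last)
        last = now
    return gaps


async def fetch_offloaded(key):
    # Run the blocking call in the default thread pool executor,
    # so the event loop keeps servicing other coroutines meanwhile.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_cache_get, key)


async def main():
    beat = asyncio.ensure_future(heartbeat(5))
    value = await fetch_offloaded("user:1")
    gaps = await beat
    return value, max(gaps)


if __name__ == "__main__":
    value, worst_gap = asyncio.run(main())
    print(value, "worst heartbeat gap: %.3fs" % worst_gap)
```

With the call offloaded, the heartbeat gaps should stay close to the 0.02 s interval even though the simulated cache call takes 0.2 s. Calling `blocking_cache_get` directly inside a coroutine would instead stall every other task for the full 0.2 s.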

Upvotes: 3

Views: 1561

Answers (1)

Ben Darnell

Reputation: 22154

It sounds like your code might be blocking the event loop somewhere (for an integer number of seconds - do you have any calls to time.sleep()?). Try using IOLoop.set_blocking_log_threshold to find places where the event loop is being blocked.
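To illustrate why a blocked event loop shows up as timeouts, here is a small sketch in plain asyncio (the loop that Tornado 5 and later run on; the question's Tornado versions use their own IOLoop, but the principle is the same). `blocking_handler`, `timed_wait` and `measure` are hypothetical names for this example, and the `time.sleep` stands in for any synchronous call made inside a handler.

```python
import asyncio
import time


async def blocking_handler(seconds):
    # Stand-in for a handler that calls a synchronous library.
    # time.sleep() never yields, so the whole event loop is stuck.
    time.sleep(seconds)


async def timed_wait(expected):
    # Ask for a short timer and report how long it really took.
    start = time.monotonic()
    await asyncio.sleep(expected)
    return time.monotonic() - start


async def measure(block_for=0.3, timer=0.05):
    waiter = asyncio.ensure_future(timed_wait(timer))
    await asyncio.sleep(0)  # yield once so the waiter can start its timer
    await blocking_handler(block_for)
    # The timer was due after `timer` seconds, but it can only fire
    # once the loop regains control after the blocking call.
    return await waiter


if __name__ == "__main__":
    # debug=True makes asyncio log callbacks that run longer than
    # loop.slow_callback_duration (0.1 seconds by default).
    elapsed = asyncio.run(measure(), debug=True)
    print("asked for 0.05s, waited %.3fs" % elapsed)
```

The 0.05 s timer ends up taking about as long as the blocking call, and an in-flight HTTP request gets stretched past its deadline in the same way. On asyncio-based versions of Tornado, asyncio's debug mode serves roughly the same purpose as IOLoop.set_blocking_log_threshold: it logs a warning naming the slow task.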

Upvotes: 1
