Jontonsoup

Reputation: 171

Python Tornado AsyncHTTPClient blocks only on Linux, not on Windows or OS X

I'm getting very strange behavior from Tornado's AsyncHTTPClient.

The same code behaves differently depending on whether I run it on Windows, OS X, Ubuntu, Red Hat, or the Amazon AMI.

Here is the relevant code:

request = HTTPRequest(self.URL,
                      method="POST",
                      auth_username=self.USERNAME,
                      auth_password=self.PASSWORD,
                      headers=self.HEADERS,
                      body=formatted_request)
try:
    print "before", datetime.now()
    # fetch() is expected to return right away; handle_response runs later
    future = self.HTTP_CLIENT.fetch(request, self.handle_response)
    print "after", datetime.now()
# ... (except clause not shown in the post)

On OS X and Windows, the output of this code is (non-blocking):

before 2015-10-27 17:51:13.896538
after 2015-10-27 17:51:14.414656
before 2015-10-27 17:51:14.418626
after 2015-10-27 17:51:14.420233
before 2015-10-27 17:51:14.423062
after 2015-10-27 17:51:14.424126
before 2015-10-27 17:51:14.426491
after 2015-10-27 17:51:14.427542
before 2015-10-27 17:51:14.429675
after 2015-10-27 17:51:14.430702
before 2015-10-27 17:51:14.432825
after 2015-10-27 17:51:14.433863

On Ubuntu, Red Hat, and the Amazon AMI I get this (a gap of about 2 seconds between "before" and "after" in what is supposed to be non-blocking code):

before 2015-10-27 21:49:23.644458
after 2015-10-27 21:49:25.541746
before 2015-10-27 21:49:25.542827
after 2015-10-27 21:49:27.428840
before 2015-10-27 21:49:27.429993
after 2015-10-27 21:49:29.326183
before 2015-10-27 21:49:29.327549

I noticed in the Tornado code that there is a difference between Linux and OS X:

We use epoll (Linux) or kqueue (BSD and Mac OS X) if they are available, or else we fall back on select(). If you are implementing a system that needs to handle thousands of simultaneous connections, you should use a system that supports either epoll or kqueue.

But the performance difference between the platforms seems unlikely to be an epoll/kqueue issue.

I'm using Python 2.7 and Tornado 4.2.1. Distribution versions are the EC2 versions downloaded from the AWS instance start page.

Any ideas would be appreciated!

Thanks, Jon

Upvotes: 1

Views: 263

Answers (1)

Ben Darnell

Reputation: 22134

The difference is probably in the DNS resolution, which is blocking by default. When it's fast, you're getting a cached result, and when it's not you're going out to the original nameservers (and probably talking to a non-optimal resolver if it's taking 2 seconds).
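One way to check whether DNS is the slow step is to time a bare lookup on each machine, outside Tornado. The sketch below uses only the standard library; the helper name time_lookup and the default port are illustrative assumptions, not part of the original code.

```python
import socket
import time


def time_lookup(host, port=443):
    """Return (seconds_elapsed, addrinfo_list) for one blocking lookup."""
    start = time.time()
    # getaddrinfo blocks the calling thread until the system resolver answers.
    results = socket.getaddrinfo(host, port,
                                 socket.AF_UNSPEC, socket.SOCK_STREAM)
    return time.time() - start, results


# Example call (replace the placeholder with the host part of self.URL):
# elapsed, _ = time_lookup("your-api-host.example")
```

If this call alone takes close to 2 seconds on the Linux hosts and only milliseconds elsewhere, that would point at the resolver configuration rather than at epoll or kqueue.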

Try installing the futures package and doing tornado.netutil.Resolver.configure("tornado.netutil.ThreadedResolver").
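As a rough illustration of the idea behind a threaded resolver, the blocking lookup can be handed to a thread pool so that the calling thread stays free. This is a standard-library sketch under stated assumptions, not Tornado's actual implementation: the pool size and the name resolve_in_thread are hypothetical. On Python 2.7, concurrent.futures comes from the futures package mentioned above.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool size chosen for illustration.
_dns_pool = ThreadPoolExecutor(max_workers=4)


def resolve_in_thread(host, port):
    """Submit a blocking getaddrinfo call to the pool and return a Future."""
    # submit() returns immediately; the lookup runs on a worker thread.
    return _dns_pool.submit(socket.getaddrinfo, host, port,
                            socket.AF_UNSPEC, socket.SOCK_STREAM)
```

In a Tornado application the Resolver.configure call from this answer would most likely belong in startup code, before the first AsyncHTTPClient is created, so that the client picks up the configured resolver.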

Upvotes: 4
