johhny B

Reputation: 1452

AsyncHTTPClient max_clients

I set the maximum number of simultaneous AsyncHTTPClient fetches in Tornado as follows:

AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient", max_clients=2000)

As you can see, I'm using curl_httpclient. Assuming the hardware can handle more clients, are there any other software limits in the OS (Linux in this case) or in libcurl that would apply? For example, if I set max_clients=10000, should this just work out of the box?

Also, if I have multiple processes running Tornado that each use AsyncHTTPClient, does each process get its own max_clients, or is the max_clients number shared across all processes?

UPDATE

OK, so the documentation states:

If additional keyword arguments are given, they will be passed to the constructor of each subclass instance created. The keyword argument max_clients determines the maximum number of simultaneous fetch() operations that can execute in parallel on each IOLoop. Additional arguments may be supported depending on the implementation class in use.

So because each process has its own IOLoop, I guess that means each process can use up to max_clients.

Upvotes: 1

Views: 1285

Answers (1)

Ben Darnell

Reputation: 22134

You'll probably also need to increase the file descriptor limit. Curl may use up to 4x as many file descriptors as you have max_clients (4x the number of handles is the default for CURLMOPT_MAXCONNECTS), in addition to the other file descriptors that may be needed by your process.
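As a rough illustration of the file descriptor point, here is a minimal sketch that estimates how many descriptors a process might need and raises its soft limit with Python's standard resource module. The helper names (estimate_fds_needed, raise_nofile_limit) and the headroom value of 256 are assumptions made for this example, not part of Tornado or libcurl. The limit is per process, so each Tornado process would need to do this itself.

```python
import resource


def estimate_fds_needed(max_clients, per_client=4, headroom=256):
    """Rough upper bound on descriptors: up to 4x max_clients for curl,
    plus headroom (an assumed value) for everything else in the process."""
    return max_clients * per_client + headroom


def raise_nofile_limit(needed):
    """Raise the soft RLIMIT_NOFILE toward `needed`, capped by the hard limit.

    Returns the soft limit in effect afterwards. Raising the hard limit
    itself requires privileges (or ulimit / limits.conf changes) and is
    not attempted here.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Nothing to do if the soft limit is unlimited or already large enough.
    if soft == resource.RLIM_INFINITY or soft >= needed:
        return soft
    # An unprivileged process may only raise the soft limit up to the hard limit.
    target = needed if hard == resource.RLIM_INFINITY else min(needed, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target
```

Calling raise_nofile_limit(estimate_fds_needed(2000)) before configuring AsyncHTTPClient would be one way to use it. If the returned value is lower than what you asked for, the hard limit is the bottleneck and has to be raised outside the process.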

There may be other limits that are specific to your environment, the network, or the site(s) you're crawling.

Upvotes: 2
