Reputation: 51
Is there an Ubuntu OS limit that caps the number of requests a machine can handle?
To investigate, I set up an Nginx proxy server with a single backend server handling incoming POST requests. I then sent a total of 1,000 requests at 0.001 seconds per request (i.e. 1,000 requests/sec) to the proxy server's address using the pycurl package. (I also tried the Python requests package with similar results.) On the server side, it took roughly 4 seconds to complete the requests, a rate of about 250 rps.
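Roughly, the sending loop looks like the sketch below (simplified; the URL, payload, and thread-pool size are placeholders rather than the exact script). Submissions are paced at 0.001-second intervals from a small thread pool so the client isn't blocked waiting on each individual response:

import time
from concurrent.futures import ThreadPoolExecutor

import requests

URL = "http://proxy.example.com/"  # placeholder for the proxy server address
TOTAL = 1000
INTERVAL = 0.001                   # target spacing: 1000 requests/sec

def post(i):
    # each call sends one small POST and returns the response
    return requests.post(URL, data={"value": i}, timeout=5)

start = time.time()
with ThreadPoolExecutor(max_workers=50) as pool:
    futures = []
    for i in range(TOTAL):
        futures.append(pool.submit(post, i))
        time.sleep(INTERVAL)       # pace submissions at ~1000/sec
    for f in futures:
        f.result()                 # wait for every response to come back

elapsed = time.time() - start
print(f"{TOTAL} requests in {elapsed:.2f}s -> {TOTAL / elapsed:.0f} rps")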
When I extended this to 10,000 requests, still at 0.001 seconds per request (1,000 rps), there appeared to be a batch limit on the number of requests the Nginx proxy server could receive before handling them. That limit seems to be about 1,200.
Order of events
At this client rate of 1,000 rps for a total of 10,000 requests, the server-side throughput still came out to roughly 250 rps with one backend server behind the proxy.
I extended my backend by adding eight additional threads, thinking this would increase throughput. (The machine being used is an AWS Lightsail instance with 8 cores.) It didn't. (I've exhausted nearly every Nginx proxy configuration; that isn't the issue.) Even with 8 threads behind the proxy server, server-side throughput came out to roughly 250 rps, with the same batch-like request handling appearing to occur. (Again, this was checked by watching the nginx status as requests were sent.)
I have also tested sending requests directly to one of the backend servers, again getting a throughput of roughly 250 rps. You would expect increasing or decreasing the thread count to affect the results, but every setup consistently handles incoming POST requests at roughly 250 rps.
Edit: the backend server process takes roughly 0.0002 seconds to read a request and send out a response. (This metric excludes network latency.) The backend is a very short process that calculates a number after receiving the request.
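To give a sense of what the backend does, it is on the order of this single-threaded sketch (the real framework, port, and calculation aren't shown here; this stand-in just reads the POST body and returns a computed number):

from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)            # read the incoming POST data
        result = str(sum(body) % 1000).encode()   # stand-in for the number calculation
        self.send_response(200)
        self.send_header("Content-Length", str(len(result)))
        self.end_headers()
        self.wfile.write(result)

    def log_message(self, fmt, *args):
        pass  # suppress per-request logging so it doesn't skew the timing

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8000), Handler).serve_forever()   # port is a placeholder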
Setups
client ---> proxy server ---> one thread handling responses ~ 250 rps
client ---> proxy server ---> eight threads handling responses ~ 250 rps
client ---> one thread handling responses ~ 250 rps
Again, these are POST requests, not GET requests. (The data packets being sent are not big.)
Given this background information, is there some OS-level setting that limits the rate at which a machine can handle or receive requests? If so, how can that be changed? If not, are there network limitations that could be causing this issue? If so, how can I test or change those limitations?
Upvotes: 1
Views: 750
Reputation: 4368
[This is an updated copy of my comment]
Regarding nginx, this might be related: https://www.nginx.com/blog/tuning-nginx/#Tuning-Your-NGINX-Configuration - check worker_processes and worker_connections.
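For example, something along these lines in nginx.conf (values below are illustrative, not taken from your setup):

worker_processes auto;        # one worker process per CPU core

events {
    worker_connections 4096;  # max simultaneous connections per worker
}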
Besides that, it seems to be limited by the backend server - you haven't said much about what it looks like or how it handles requests. You should monitor/trace your process to determine what it's doing.
You can also use tools like netstat, ss, tcpdump et al. to monitor and trace connections on your backend server.
One obvious limitation is the maximum number of ports available for connections: for a single source and destination pair this is roughly 28K (https://blog.box.com/ephemeral-port-exhaustion-and-web-services-at-scale). If you're rapidly creating and closing connections, you might be limited by the TIME_WAIT state (see the link above), which every socket enters when its connection is closed. However, you said you generated at most 10K connections, so it seems to be something different.
Upvotes: 2