Reputation: 890
I am currently running a Flask + SQLAlchemy + uWSGI + nginx stack as the API backend for my application. I was trying to determine the maximum number of concurrent connections my server can handle by using ApacheBench to send varying numbers of concurrent requests to an endpoint on the server.
This endpoint takes a JSON request body, extracts certain values, runs a query, and returns a JSON response based on the results of the query.
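For illustration, here is a minimal sketch of what such an endpoint might look like; the route, model, and column names are placeholders, not my actual code:

    from flask import Flask, jsonify, request
    from flask_sqlalchemy import SQLAlchemy

    app = Flask(__name__)
    # placeholder connection string; psycopg2cffi is the dialect I use under PyPy
    app.config["SQLALCHEMY_DATABASE_URI"] = "postgresql+psycopg2cffi://user:pass@localhost/mydb"
    db = SQLAlchemy(app)

    class Item(db.Model):  # placeholder model standing in for my real table
        id = db.Column(db.Integer, primary_key=True)
        name = db.Column(db.String(80))

    @app.route("/api/lookup", methods=["POST"])
    def lookup():
        payload = request.get_json()                   # take the JSON request body
        name = payload.get("name")                     # extract the value of interest
        rows = Item.query.filter_by(name=name).all()   # run the query
        return jsonify(results=[r.id for r in rows])   # JSON response from the results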
I ran a baseline test of 10 requests at a concurrency of 1 and got an average response time of 60ms.
Another test with 10 concurrent requests over 100 total requests averaged 150ms; 100 concurrent requests over 1000 averaged 1500ms; and 500 concurrent requests came in around 7000-9000ms.
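For reference, the ab invocations for these runs looked something like the following (the endpoint URL and JSON payload file are placeholders, not my actual ones):

    ab -n 10   -c 1   -p body.json -T application/json http://myserver/api/lookup
    ab -n 100  -c 10  -p body.json -T application/json http://myserver/api/lookup
    ab -n 1000 -c 100 -p body.json -T application/json http://myserver/api/lookup
    ab -n 5000 -c 500 -p body.json -T application/json http://myserver/api/lookup

Here is the full output of the 500-concurrency run: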
Concurrency Level: 500
Time taken for tests: 38.710 seconds
Complete requests: 5000
Failed requests: 0
Total transferred: 1310000 bytes
Total body sent: 1105000
HTML transferred: 110000 bytes
Requests per second: 129.17 [#/sec] (mean)
Time per request: 3870.986 [ms] (mean)
Time per request: 7.742 [ms] (mean, across all concurrent requests)
Transfer rate:          33.05 [Kbytes/sec] received
                        27.88 kb/s sent
                        60.93 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       24   63  185.1     25    3025
Processing:   161 3737 2778.7   3316   26719
Waiting:      157 3737 2778.7   3316   26719
Total:        187 3800 2789.7   3366   26744
Percentage of the requests served within a certain time (ms)
50% 3366
66% 4135
75% 4862
80% 5711
90% 7449
95% 9158
98% 11794
99% 13373
100% 26744 (longest request)
The latency seems to increase linearly, which makes sense, but it increased TOO quickly. After a lot of tinkering and profiling, I found that the bottleneck was the queries.
At the start of the benchmark the queries would process and return quickly, in under 10-50ms, but latency rose fast, and in some cases individual queries took 10000-15000ms.
I couldn't figure out why the database was slowing down so much, especially since it is empty (barring test data).
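I measured the per-query times with something along the lines of SQLAlchemy's documented cursor-execute event hooks (this is a simplified sketch, not my exact profiling code):

    import logging
    import time

    from sqlalchemy import event
    from sqlalchemy.engine import Engine

    @event.listens_for(Engine, "before_cursor_execute")
    def before_cursor_execute(conn, cursor, statement, parameters, context, executemany):
        # record the start time on the connection before each query runs
        conn.info.setdefault("query_start_time", []).append(time.time())

    @event.listens_for(Engine, "after_cursor_execute")
    def after_cursor_execute(conn, cursor, statement, parameters, context, executemany):
        # log how long the query took once it returns
        elapsed_ms = (time.time() - conn.info["query_start_time"].pop()) * 1000
        logging.warning("query took %.0f ms: %s", elapsed_ms, statement)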
I tried running the application WITHOUT a connection pool, and latency went down (from 7-9s to 5-6s). I did not think this was possible, because everything I had read suggested that a connection pool always makes things faster, since you avoid the overhead of establishing a new connection on every request.
I also experimented with INCREASING the connection pool size (from the default 5 to 50), and that decreased latency even more than the pool-less setup (from 5-6s to 3-4s).
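Concretely, the two setups correspond to roughly the following SQLAlchemy engine configurations (the connection string is a placeholder):

    from sqlalchemy import create_engine
    from sqlalchemy.pool import NullPool

    DB_URI = "postgresql+psycopg2cffi://user:pass@localhost/mydb"  # placeholder

    # "No connection pool": NullPool opens a fresh connection on every checkout.
    engine_no_pool = create_engine(DB_URI, poolclass=NullPool)

    # Larger pool: up to 50 persistent connections instead of the default 5.
    engine_big_pool = create_engine(DB_URI, pool_size=50, max_overflow=10)

Here is the ab output for the larger-pool run: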
Concurrency Level: 500
Time taken for tests: 4.842 seconds
Complete requests: 836
Failed requests: 0
Non-2xx responses: 679
Total transferred: 272673 bytes
Total body sent: 294593
HTML transferred: 126353 bytes
Requests per second: 172.67 [#/sec] (mean)
Time per request: 2895.662 [ms] (mean)
Time per request: 5.791 [ms] (mean, across all concurrent requests)
Transfer rate:          55.00 [Kbytes/sec] received
                        59.42 kb/s sent
                        114.42 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       24  170  273.1     93    1039
Processing:    25  665 1095.2    149    4753
Waiting:       25  665 1095.2    149    4753
Total:         51  835 1123.9    279    4786
Percentage of the requests served within a certain time (ms)
50% 279
66% 487
75% 1050
80% 1059
90% 2935
95% 3658
98% 4176
99% 4337
100% 4786 (longest request)
The latency is still extremely high (3-4 seconds for an API call seems unreasonable by any standard), and I am trying to figure out how to decrease it even further. Is the answer just more connections?
Note: I am running 4 uWSGI processes with 100 threads each on a server with 4 GB of RAM and a quad-core processor. I am using the psycopg2-cffi adapter for the connection, and the application is running on PyPy.
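For completeness, the uWSGI configuration is essentially the following (the module path and socket are placeholders):

    [uwsgi]
    ; placeholder module:callable and socket; nginx proxies to the socket
    module = myapp:app
    master = true
    processes = 4
    threads = 100
    enable-threads = true
    socket = /tmp/myapp.sock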
Upvotes: 3
Views: 3392
Reputation: 20508
The linear increase is very much normal if the database has to process your queries sequentially. Essentially, all concurrent requests get started at the same time but finish one after another, so, assuming a pool with a single connection, 60ms per request, and 10 concurrent requests, you're going to see requests taking 60ms, 120ms, 180ms, 240ms, 300ms, 360ms, 420ms, 480ms, 540ms, 600ms, 600ms, ..., 600ms, 540ms, 480ms, ... . We can calculate how much time it takes for the average request, given n requests and m concurrent requests:

    f(n, m) = 60ms * (((m + 1) * m / 2) * 2 + (n - 2m) * m) / n

    f(100, 10)   =   546ms
    f(1000, 100) =  5406ms
    f(1000, 500) = 15030ms
These numbers are similar to what you are seeing.
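You can reproduce those numbers with a few lines of Python (assuming, as above, a single shared connection and 60ms per query):

    def avg_latency_ms(n, m, per_query_ms=60):
        """Average request latency when m concurrent requests share one
        database connection: m requests ramp up, m ramp down, and the
        remaining n - 2m each wait for a full batch of m queries."""
        ramp = (m + 1) * m / 2      # 1 + 2 + ... + m, counted twice below
        steady = (n - 2 * m) * m    # middle requests each wait m queries
        return per_query_ms * (ramp * 2 + steady) / n

    print(avg_latency_ms(100, 10))     # 546.0
    print(avg_latency_ms(1000, 100))   # 5406.0
    print(avg_latency_ms(1000, 500))   # 15030.0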
Now comes the big question. Why does the database process queries almost sequentially? I can think of a few reasons:
How do you fix this? This is a complicated question, but a few potential solutions:
Upvotes: 2