lonewaft

Reputation: 890

Improving performance for SQLAlchemy?

I am currently running a Flask + SQLAlchemy + uWSGI + nginx stack for an API backend that I am developing for my application. I was trying to see the maximum number of concurrent connections my server can handle by using ApacheBench to send different numbers of concurrent requests to an endpoint on the server.

This endpoint would take a JSON request body, extract certain values, run a query, then return a JSON response based on the results of the query.
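
For context, the handler under test is shaped roughly like this (a minimal sketch; the actual models, route, and query are not shown above and are purely illustrative):

# Hypothetical sketch of the benchmarked endpoint; the real schema and query differ.
from flask import Flask, request, jsonify
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "postgresql+psycopg2cffi://user:pass@localhost/apidb"
db = SQLAlchemy(app)

class Item(db.Model):  # assumed table, a stand-in for the real schema
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(64), index=True)
    value = db.Column(db.Integer)

@app.route("/lookup", methods=["POST"])
def lookup():
    payload = request.get_json()                   # take a JSON request body
    name = payload.get("name")                     # extract certain values
    rows = Item.query.filter_by(name=name).all()   # run a query
    # return a JSON response based on the results of the query
    return jsonify([{"id": r.id, "value": r.value} for r in rows])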

I ran a base test of 1 concurrent request for 10 total requests and got an average response time of 60 milliseconds.
Running another test with 10 concurrent requests for 100 total requests returned an average of 150ms; 100 concurrent requests for 1000 returned 1500ms, and 500 concurrent requests returned around 7000-9000ms.

Concurrency Level:      500
Time taken for tests:   38.710 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      1310000 bytes
Total body sent:        1105000
HTML transferred:       110000 bytes
Requests per second:    129.17 [#/sec] (mean)
Time per request:       3870.986 [ms] (mean)
Time per request:       7.742 [ms] (mean, across all concurrent requests)
Transfer rate:          33.05 [Kbytes/sec] received
                        27.88 kb/s sent
                        60.93 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       24   63 185.1     25    3025
Processing:   161 3737 2778.7   3316   26719
Waiting:      157 3737 2778.7   3316   26719
Total:        187 3800 2789.7   3366   26744

Percentage of the requests served within a certain time (ms)
  50%   3366
  66%   4135
  75%   4862
  80%   5711
  90%   7449
  95%   9158
  98%  11794
  99%  13373
 100%  26744 (longest request)

The latency seems to be increasing linearly, which makes sense, but it seemed to increase TOO quickly. After a lot of tinkering and profiling, I found that the bottleneck was the queries.

At the start of the benchmark, the queries would process and return quickly, in under 10-50ms, but that quickly increased, and in some cases latencies of 10000-15000ms were seen.

I couldn't figure out why the database was slowing down so much, especially since it is empty (barring test data).

I tried running the application WITHOUT a connection pool, and the results showed that latency went down (from 7-9s to 5-6s). I did not think this was possible, because everything I had read suggested that a connection pool always makes things faster, since you avoid the overhead of establishing a new connection on every request.

I also experimented with INCREASING the connection pool size (from the default 5 to 50), and that decreased latency even more than the pool-less setup (from 5-6s to 3-4s).
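
For reference, the no-pool and bigger-pool setups were along these lines; this is a rough sketch, since I haven't shown my actual engine configuration, and the connection URL here is a placeholder:

# Rough sketch of the two engine configurations compared above; URL is a placeholder.
from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

DB_URL = "postgresql+psycopg2cffi://user:pass@localhost/apidb"

# No connection pool: every checkout opens and closes a real connection.
engine_no_pool = create_engine(DB_URL, poolclass=NullPool)

# Bigger pool: 50 persistent connections instead of the default 5.
engine_big_pool = create_engine(DB_URL, pool_size=50, max_overflow=10)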

Concurrency Level:      500
Time taken for tests:   4.842 seconds
Complete requests:      836
Failed requests:        0
Non-2xx responses:      679
Total transferred:      272673 bytes
Total body sent:        294593
HTML transferred:       126353 bytes
Requests per second:    172.67 [#/sec] (mean)
Time per request:       2895.662 [ms] (mean)
Time per request:       5.791 [ms] (mean, across all concurrent requests)
Transfer rate:          55.00 [Kbytes/sec] received
                        59.42 kb/s sent
                        114.42 kb/s total

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:       24  170 273.1     93    1039
Processing:    25  665 1095.2    149    4753
Waiting:       25  665 1095.2    149    4753
Total:         51  835 1123.9    279    4786

Percentage of the requests served within a certain time (ms)
  50%    279
  66%    487
  75%   1050
  80%   1059
  90%   2935
  95%   3658
  98%   4176
  99%   4337
 100%   4786 (longest request) 

The latency is still extremely high (3-4 seconds for an API seems unreasonable by any standard), and I am trying to figure out how I can decrease it even more. Is the answer just more connections?

Note: I am running 4 uWSGI processes with 100 threads each on a server with 4 GB of RAM and a quad-core processor. I am using the psycopg2-cffi adapter for the connection, and the application is running on PyPy.

Upvotes: 3

Views: 3392

Answers (1)

univerio

Reputation: 20508

The linear increase is very much normal if the database has to process your queries sequentially. Essentially, all concurrent requests get started at the same time but finish one after another, so, assuming a pool with a single connection, 60ms per request, and 10 concurrent requests, you're going to see requests taking 60ms, 120ms, 180ms, 240ms, 300ms, 360ms, 420ms, 480ms, 540ms, 600ms, 600ms, ..., 600ms, 540ms, 480ms, .... We can calculate the time taken by the average request, given n total requests and m concurrent requests:

f(n, m) = 60ms * (((m + 1) * m / 2) * 2 + (n - 2m) * m) / n
f(100, 10) = 546ms
f(1000, 100) = 5406ms
f(1000, 500) = 15,030ms

These numbers are similar to what you are seeing.
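
To make that arithmetic concrete, here is a small sketch that just evaluates the formula above (the function name is mine):

# Mean latency under strictly sequential query processing, per the formula above.
# The first and last m requests form ramp-up/ramp-down triangles; the middle
# n - 2m requests each wait behind a full window of m queries.
def mean_latency_ms(n, m, per_query_ms=60):
    ramp = ((m + 1) * m / 2) * 2      # ramp-up + ramp-down triangles
    steady = (n - 2 * m) * m          # steady-state portion
    return per_query_ms * (ramp + steady) / n

print(mean_latency_ms(100, 10))     # 546.0
print(mean_latency_ms(1000, 100))   # 5406.0
print(mean_latency_ms(1000, 500))   # 15030.0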

Now comes the big question. Why does the database process queries almost sequentially? I can think of a few reasons:

  • Locking: each transaction needs to lock some resource exclusively, so only one (or a few) can run at a time
  • CPU-bound query: each transaction takes significant CPU resources so other transactions must wait for CPU time
  • Large table scans: the database cannot keep the entirety of the table in memory and therefore must read from disk for every transaction

How do you fix this? That's a complicated question, but here are a few potential solutions:

  • Optimize your queries: either rewrite them so that they don't all fight for the same resource, or tune them so that they don't take as long
  • Batch your queries so that you need to run fewer in total (a rough sketch follows this list)
  • Cache your responses so that they don't hit the database at all
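
As one illustration of the batching idea (assuming a hypothetical Item model with an indexed name column, and the classic Query API), a single IN query replaces a loop of per-value queries:

# Hypothetical batching sketch: one round trip instead of one query per value.
def fetch_items_batched(session, names):
    # SELECT ... WHERE name IN (...) instead of len(names) separate SELECTs.
    return session.query(Item).filter(Item.name.in_(names)).all()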

Upvotes: 2
