Reputation: 366
My simple question: How can I increase the possible number of connections of my Amazon RDS Database? I used a parameter group where I set
max_connections = 30000
which seems to work on the first hand, as
SHOW VARIABLES LIKE 'max_connections';
returns the expected. But when I run a stress test the monitoring metrics always show a maximum number of 1200 connections. So obviously there have to be other limiting factors, I just don't know. Any help would be highly appreciated.
My test setup: 1 Load Balancer 8 fat EC2 instances (m4.4xlarge) (which is a bit overdimensioned, but I'm still testing) 1 DB: r3.4xlarge with 140 GB memory, 1 TB storage and 10.000 provisioned IOPS
Test: 30.000 virtual users in 10 minutes making 4 requests each (2 reading the DB, 1 writing it, 1 not using the DB). Fails after about two minutes because of too many errors (caused by DB timeouts).
Concerning the hardware this setup should be able to handle the test requests, shouldn't it? So I hope I'm just missing the obvious and there's a parameter which has to be adapted to make everything working.
Upvotes: 2
Views: 7195
Reputation: 366
Thanks to Michael and the hints of a colleague I was finally able to solve this problem: As Michael already supposed it wasn't caused by the DB. The answer was hidden in the Apache configuration which I took under examination after DB problems seem to be out of question (finally).
All my eight EC2 instances were limited by MaxRequestWorkers=150 (-> 8*150=1200). What is obvious for every holiday admin took me day. At least everything's working now.
Upvotes: 1
Reputation: 179442
I would strongly suggest that the first problem is not with the configuration of the server, but with your test methodology and interpretation of what you are seeing.
Hitting max_connections does not initially cause "db timeouts." It causes connection errors, because the server actively rejects excessive connection attempts, with a refusal to negotiate further. This is not the same thing as a timeout.
At what point, during what operation, are the timeouts occurring? Initial connection phase? That's not going to be related to max_connections, at least not directly.
The maximum connections you observe seems like a suspiciously round number and potentially is even derivable from your test parameters... You mentioned 30000 users and 10 minutes and 4 requests... and 30000 × 4 ÷ 10 ÷ 10 = 1200. Yes, I threw in the "10" twice for no particular reason other than 1200 just seems very suspicious. I wonder whether, if you used 15000 users, the number would drop from 1200 to 600. That would be worth investigating.
Importantly, to serve 30000 concurrent users, your application does not need 30000 database connections. If it does, it's written very, very badly. I don't know how you're testing this, but only a naive implementation given the stated parameters would assume 30000 connections should be established.
Equally important, 30000 connections to a single MySQL server regardless of size seems completely detached from reality, except maybe with thread pooling, which isn't available in the version of MySQL used in RDS. If you were to successfully create that many connections, on a cold server or one without a massive thread cache already warmed up, it would likely take several minutes just for the OS to allow MySQL to create that many new threads. You would indeed see timeouts here, because the OS would not let the server keep up with the incoming demand, but it would be unrelated to max_connections.
It would seem like your most likely path at this point would not be to assume that max_connections isn't actually set to the value that it claims, and to scale down your test parameters, see how the behavior changes and go from there in an effort to understand what is actually happening. Your test parameters also need to be meaningful related to the actual workload you're trying to test against.
Upvotes: 5