
Tags: mysql, django, aws-lambda, serverless-architecture, amazon-rds-proxy

Reputation: 5411

AWS Lambda flooding RDS MySQL connections during spikes

I have Django running on AWS Lambda connecting to MySQL on RDS. Everything works great most of the time.

However, if a spike produces 10,000 concurrent requests, Lambda spawns many containers and each one opens its own database connection, which eventually exceeds the RDS connection limit (see https://serverfault.com/questions/862387/aws-rds-connection-limits).

What is the best strategy (if any) to get this to web-ish scale without giving up SQL? Some ideas:

Is there a way to limit the number of AWS containers to encourage/force re-use rather than spawning more? (E.g. set the maximum number of Lambda containers to some proportion of the DB connection limit.) If the container limit is reached, the request could wait until a hot container is available.
Can API Gateway detect a spike and delay putting the connections through to Lambda? This would allow most requests to be fulfilled by re-used hot containers, which would not create excessive DB connections. I know API Gateway allows throttling, but it is very coarse and can't do anything other than drop requests that exceed the limit.
Liberal use of MySQL/RDS read-replicas

Upvotes: 2

Answers (3)

LiriB

Reputation: 822

You can work around the mismatch between the number of concurrent Lambda executions and the number of MySQL RDS connections by using RDS Proxy, which AWS introduced in December 2019.

What RDS Proxy actually does is connection pooling, which Lambda cannot do on its own. Note, however, that RDS Proxy comes at an additional cost.

Ref: https://aws.amazon.com/blogs/compute/using-amazon-rds-proxy-with-aws-lambda/
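On the Django side, switching to the proxy is mostly a matter of pointing the database host at the proxy endpoint instead of the instance endpoint. A minimal sketch, assuming the endpoint and credentials arrive through environment variables (the variable names and the `build_databases` helper are hypothetical, not part of Django or AWS):

```python
import os


def build_databases(env=os.environ):
    """Return a Django DATABASES dict that targets an RDS Proxy endpoint."""
    return {
        "default": {
            "ENGINE": "django.db.backends.mysql",
            # Hypothetical variable holding the proxy endpoint,
            # not the endpoint of the database instance itself.
            "HOST": env["RDS_PROXY_ENDPOINT"],
            "PORT": env.get("DB_PORT", "3306"),
            "NAME": env["DB_NAME"],
            "USER": env["DB_USER"],
            "PASSWORD": env["DB_PASSWORD"],
            # 0 is Django's default: the connection is closed at the end of
            # each request, so the proxy can hand it to another client.
            "CONN_MAX_AGE": 0,
        }
    }


# In settings.py you would then write:
# DATABASES = build_databases()
```

The proxy keeps a pool of connections to MySQL and multiplexes the Lambda connections over it, so the application code itself does not need to change.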

Upvotes: 4

Bill Karwin

Reputation: 562230

After the AWS conference in November 2017, http://docs.aws.amazon.com/lambda/latest/dg/concurrent-executions.html says:

You can optionally set the concurrent execution limit for a function.

Visit the page for details.
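As a rough sketch of how that limit could be tied to the database, the following sets a function's reserved concurrency to a share of the MySQL connection limit. It assumes a boto3 Lambda client is passed in; the helper names, the function name, and the 80% headroom figure are illustrative assumptions.

```python
def reserved_concurrency(max_connections, headroom_percent=80):
    """Concurrency cap that stays below the database connection limit."""
    if max_connections < 1 or not 0 < headroom_percent <= 100:
        raise ValueError("max_connections must be >= 1 and percent in 1..100")
    return max(1, max_connections * headroom_percent // 100)


def apply_limit(lambda_client, function_name, max_connections, headroom_percent=80):
    """Set the per-function concurrency limit and return the value used."""
    limit = reserved_concurrency(max_connections, headroom_percent)
    lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=limit,
    )
    return limit


# Example call (requires boto3 and AWS credentials; the name is hypothetical):
# import boto3
# apply_limit(boto3.client("lambda"), "my-django-function", max_connections=1000)
```

Keep in mind that invocations above the limit are throttled rather than queued, so this caps the number of connections but does not make excess requests wait for a hot container.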

I posted the answer below before the per-function concurrency controls were announced:

You could rely on the AWS limit on concurrent Lambda executions (1000 per region by default): if the max connection limit on your RDS instance is at least 1000, it won't be a problem. In other words, AWS will throttle your Lambda concurrency before MySQL on RDS limits you. Depending on the instance size, 1000 is not an unusual value for max connections on MySQL.

If you use the default max connections on RDS, you have to use at least a db.m3.xlarge or db.r3.large (see value of max_connections in AWS RDS). But you can change the max connections in a parameter group (thanks to comment from @Michael-sqlbot for this info).
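For a feel of where those instance sizes come from: the default parameter group for RDS MySQL derives max_connections from instance memory, using the formula {DBInstanceClassMemory/12582880}. A small sketch of that arithmetic follows; the function name is made up for illustration, and DBInstanceClassMemory is somewhat lower than the nominal RAM of the instance class, so treat the result as an upper estimate.

```python
BYTES_PER_CONNECTION = 12582880  # divisor in the RDS MySQL default formula


def default_max_connections(instance_memory_bytes):
    """Estimate the default max_connections for a given amount of memory."""
    return instance_memory_bytes // BYTES_PER_CONNECTION


# Nominal 15 GiB of memory, as an upper bound for a class of that size.
estimate = default_max_connections(15 * 1024**3)
print(estimate)
```

If the estimate for your instance class is below the Lambda concurrency you expect, either raise max_connections in a custom parameter group or cap the concurrency.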

I have another idea if you want to throttle your own concurrent executions for your specific Django app. Note: I have not tried this, I just thought of it.

Suppose you want no more than 75 concurrent executions.

Create an SQS queue, and fill it with 75 entries. It doesn't matter what values you put in the queue, just the quantity is important.
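In your Lambda function, before you open a connection to MySQL, pull one item from the SQS queue. If there's something in the queue, it'll return almost instantly. If the queue is empty, your Lambda has to wait until a value is inserted in the queue; since a single SQS long poll waits at most 20 seconds, that means polling in a loop. Delete the item you received so that it doesn't reappear in the queue after its visibility timeout.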
Just before your Lambda function exits, close the MySQL connection, and then push one value back into the SQS queue.

Thus if you have 75 Lambda functions running, the 76th request will have to wait until one of the Lambda functions restores a value back into the queue.
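The steps above can be sketched as follows. This is untested against real SQS, like the idea itself; it assumes a boto3 SQS client is passed in, and the helper names (`prime_queue`, `acquire`, `db_slot`) and the 60-second deadline are made up for illustration.

```python
import contextlib
import time


def prime_queue(sqs, queue_url, tokens):
    """Fill the queue with one message per allowed concurrent execution."""
    for i in range(tokens):
        sqs.send_message(QueueUrl=queue_url, MessageBody=f"token-{i}")


def acquire(sqs, queue_url, deadline_seconds=60):
    """Take one token, long-polling repeatedly; return False on deadline."""
    give_up_at = time.monotonic() + deadline_seconds
    while time.monotonic() < give_up_at:
        resp = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=1,
            WaitTimeSeconds=20,  # longest wait SQS allows for one long poll
        )
        for msg in resp.get("Messages", []):
            # Delete the token so it does not become visible again after
            # the visibility timeout while we are still holding the slot.
            sqs.delete_message(
                QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"]
            )
            return True
    return False


@contextlib.contextmanager
def db_slot(sqs, queue_url, deadline_seconds=60):
    """Hold one token for the duration of the with-block."""
    if not acquire(sqs, queue_url, deadline_seconds):
        raise TimeoutError("no free database slot")
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions raised inside the block.
        sqs.send_message(QueueUrl=queue_url, MessageBody="token")


# Inside the handler (boto3 client and queue URL are assumptions):
# with db_slot(sqs, QUEUE_URL):
#     ...open the MySQL connection, do the work, close the connection...
```

The `finally` clause returns the token when the code inside the block raises, but it cannot run if the Lambda is killed outright, for example by hitting its timeout. Time spent waiting for a token also counts toward the function's billed duration.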

This architecture allows you to control your scale to any max concurrency you want, up to your max of 1000 per region imposed by AWS. Just prime the SQS queue with a different number of items. You can even increase or decrease the throttling without redeploying your Lambda code, just by adding a few or deleting a few items in the queue.

You can also allow different Lambda functions to have their own "pool" because for example you might have different Django apps connecting to their own RDS instances. Just create a new SQS queue per "pool" and make sure different Lambda functions are using the right SQS queue.

I suppose there's a risk that if any Lambda aborts before it restores its value to the SQS queue, you'd see the queue deplete gradually and eventually become empty. Ideally, the length of the queue would shrink and grow cyclically. But if the length plummets toward zero, it could indicate your Lambda code is crashing. You might want to set up some kind of alert for this with CloudWatch.
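One way such an alert might look is an alarm on the queue's ApproximateNumberOfMessagesVisible metric that fires when the queue has stayed empty for several consecutive periods. The sketch below only builds the arguments for CloudWatch's `put_metric_alarm`; the helper name, alarm name, period, and SNS topic are assumptions to adapt.

```python
def token_alarm_params(queue_name, sns_topic_arn, periods=3):
    """Arguments for an alarm that fires when the token queue stays empty."""
    return {
        "AlarmName": f"{queue_name}-tokens-depleted",
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateNumberOfMessagesVisible",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        # Maximum: the alarm only fires if the queue never held a token
        # at any point during each evaluated period.
        "Statistic": "Maximum",
        "Period": 300,
        "EvaluationPeriods": periods,
        "Threshold": 0,
        "ComparisonOperator": "LessThanOrEqualToThreshold",
        "AlarmActions": [sns_topic_arn],
    }


# Creating the alarm (requires boto3 and AWS credentials):
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(
#     **token_alarm_params("db-tokens", "arn:aws:sns:...")
# )
```

An empty queue is normal for short stretches under heavy load, which is why the sketch looks at the maximum over several periods rather than a single reading.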

Upvotes: 4

Mark B

Reputation: 200466

Is there a way to limit the number of AWS containers to encourage/force re-use rather than spawning more?

Can API Gateway detect a spike and delay putting the connections through to Lambda?

Liberal use of MySQL/RDS read-replicas

That's your best option without moving away from relational databases entirely. Or move your Django app back to a more traditional EC2 web server and use database connection pools.
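For the read-replica route, Django can send reads to replicas through a database router. A minimal sketch, assuming two replica aliases named "replica1" and "replica2" are defined in DATABASES next to "default" (the aliases and the class name are hypothetical):

```python
import random


class ReplicaRouter:
    """Send reads to a random replica and all writes to the primary."""

    replicas = ["replica1", "replica2"]  # hypothetical aliases in DATABASES

    def db_for_read(self, model, **hints):
        return random.choice(self.replicas)

    def db_for_write(self, model, **hints):
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # All aliases hold the same data, so relations are always fine.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Run migrations on the primary only; replicas receive the schema
        # through replication.
        return db == "default"


# In settings.py (module path is an assumption):
# DATABASE_ROUTERS = ["myproject.routers.ReplicaRouter"]
```

Replicas spread the read connections over more instances, but every Lambda container still opens its own connections, and reads from a replica can lag behind recent writes.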

Upvotes: 3
