Reputation: 1263
To speed up Lambda execution, I am trying to move some parts of my Python code outside the handler function.
As per Lambda's documentation:
After a Lambda function is executed, AWS Lambda maintains the Execution Context for some time in anticipation of another Lambda function invocation. In effect, the service freezes the Execution Context after a Lambda function completes, and thaws the context for reuse, if AWS Lambda chooses to reuse the context when the Lambda function is invoked again. This Execution Context reuse approach has the following implications:
Any declarations in your Lambda function code (outside the handler code, see Programming Model) remains initialized, providing additional optimization when the function is invoked again. For example, if your Lambda function establishes a database connection, instead of reestablishing the connection, the original connection is used in subsequent invocations…
Following their example, I have moved my database connection logic outside the handler function so subsequent WARM runs of the function can re-use the connection instead of creating a new one each time the function executes.
However, AWS Lambda provides no guarantee that every invocation after a COLD start will run WARM, so if Lambda decides a COLD start is necessary, my code will re-create the database connection.
When this happens, I assume the previous (WARM) instance of my function that Lambda tore down still had an active database connection that was never closed, and if the pattern keeps repeating, I suspect I'll end up with a lot of orphaned DB connections.
Is there a way in Python to detect if Lambda is trying to kill my function instance (maybe they send a SIGTERM signal?) and have it close active DB connections?
The database I'm using is Postgres.
Upvotes: 20
Views: 6266
Reputation: 37075
The accepted answer is no longer correct. It might have been in the past, but today your Lambda should receive a SIGTERM when AWS intends to terminate it.
AWS has official examples of handling graceful shutdowns in Python and other languages here:
https://github.com/aws-samples/graceful-shutdown-with-aws-lambda/tree/main/python-demo
But effectively you do:
import signal

def exit_gracefully(signum, frame):
    # Runs when the execution environment receives SIGTERM
    print('SIGTERM RECEIVED')

signal.signal(signal.SIGTERM, exit_gracefully)
This gets called on container shutdown, and you have 300ms to do cleanup.
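To tie this back to the question, here is a minimal sketch of the same handler closing a module-level connection. `FakeConnection` is a hypothetical stand-in so the snippet runs anywhere; in a real function you would put your driver's connection object (for example one from psycopg2) in its place. As far as I understand, Lambda only delivers SIGTERM to the runtime when at least one extension is registered on the function, so treat that as an assumption to check against the current docs.

```python
import signal
import sys


class FakeConnection:
    """Hypothetical stand-in for a real driver connection."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


# Created once per execution environment and reused by warm invocations.
connection = FakeConnection()


def exit_gracefully(signum, frame):
    # Keep this short: the cleanup window is only a few hundred milliseconds.
    if not connection.closed:
        connection.close()
    sys.exit(0)


signal.signal(signal.SIGTERM, exit_gracefully)


def handler(event, context):
    # Use the shared connection here instead of opening a new one.
    return {"connection_open": not connection.closed}
```

Anything slow (flushing large buffers, extra network calls) is likely to be cut off by the SIGKILL that follows, so closing the connection should be the only work done here.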
Upvotes: 15
Reputation: 12504
I don't think what you are looking for is possible at the moment. Hacks might work, but I would advise against depending on them, as undocumented behaviour in a closed-source system can stop working at any time without notice.
I guess you are concerned about the number of new connections created by your Lambda functions and the load they put on the DB server.
Have you seen pgbouncer (https://pgbouncer.github.io/)? It is one of the best-known connection poolers for Postgres. I would recommend putting something like pgbouncer between your Lambda function and the DB.
This removes the load on your DB server caused by creating new connections, because connections between pgbouncer and Postgres can stay open for a long time. The Lambda functions can open new connections to pgbouncer, which is more than capable of handling unclosed connections through its various timeout settings.
Update on 9th Dec 2019
AWS recently announced RDS Proxy, which is capable of connection pooling. It is currently in preview and has no support for PostgreSQL, but they say it's coming soon.
https://aws.amazon.com/rds/proxy/
https://aws.amazon.com/blogs/compute/using-amazon-rds-proxy-with-aws-lambda/
Upvotes: 3
Reputation: 4486
I haven't had time to test this, but how about trap? I'm AFK at the moment, but I'll edit this answer after some experimentation when I get in.
FYI, I don't know what signals are sent when a container gets killed; it's not something I've looked at, so this answer assumes containers are decommissioned the same way a normal Linux machine goes down.
In your handler you'd add a shell command that runs this script, and then set a variable which will remain in place while the container is being re-used. I'm not a Python guy, but your logic would go something like this:
Handler
const { exec } = require('child_process');

// A property on `global` survives between invocations while the container is re-used.
if (typeof global.isNewContainer === 'undefined') {
    global.isNewContainer = true;
    // run a shell script, in javascript we use shell exec and
    // then have a callback for when it exits so the execution is non blocking and allows
    // the code below to execute.
    exec('./script.sh & sleep 1 && kill -- -$(pgrep script.sh)', (err, stdout, stderr) => {
        // close db connections
    });
}
// handle the request
Shell script based on this answer:
#!/bin/bash
exitCallback() {
    trap - SIGTERM # clear the trap
    kill -- -$$ # Sends SIGTERM to child/sub processes
}
trap exitCallback SIGTERM
sleep infinity
Make sure you read the comments on the accepted answer to that question, as they give you the shell commands to run the script.
I would say it's pretty easy to keep containers warm but your question was "Is there a way in Python to detect if Lambda is trying to kill my function instance (maybe they send a SIGTERM signal?) and have it close active DB connections?"
Upvotes: 0
Reputation: 464
I totally agree with @dudemullet.
Currently there is no way to say for sure when a Lambda function is going to die. The best approach is to first understand the purpose of your connection. If it is only a simple select/update query that should not take long to execute, I would suggest opening and closing the connection inside the handler function. That way you can at least be 100% sure that there will not be any orphaned connections.
But on the flip side, you will have to bear those few extra milliseconds of connection setup on every invocation, not just on a cold start!
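A minimal sketch of the open-and-close-per-invocation pattern, assuming your driver's connection has a `close()` method. `FakeConnection`, `connect()` and the `opened` list are hypothetical stand-ins so the snippet runs without a database; swap in your real driver's connect call.

```python
from contextlib import closing


class FakeConnection:
    """Hypothetical stand-in for a real driver connection."""

    def __init__(self):
        self.closed = False

    def query(self, sql):
        # A real connection would send the SQL to the database.
        return [(1,)]

    def close(self):
        self.closed = True


opened = []  # for demonstration only: records every connection created


def connect():
    # Hypothetical factory; replace with your driver's connect call.
    conn = FakeConnection()
    opened.append(conn)
    return conn


def handler(event, context):
    # closing() calls conn.close() even if the query raises.
    with closing(connect()) as conn:
        rows = conn.query("SELECT 1")
    return {"rows": rows}
```

Because the connection never outlives the invocation, nothing is left behind when the execution environment is frozen or destroyed.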
Upvotes: 0
Reputation: 410
Unfortunately, there is no way to know when a Lambda container will be destroyed.
With that out of the way, cold boots and DB connections are both heavily discussed topics when it comes to Lambda. The worst part is that there is no definitive answer; it has to be handled on a use-case basis.
Personally, I think the best way to go about this is to create connections and kill the idle ones based on a timeout on the Postgres side. For that I direct you to How to close idle connections in PostgreSQL automatically?
You might also want to fine-tune how many Lambdas you have running at any point in time. For this I would recommend setting a concurrency level on your Lambda (see the AWS docs). This way you limit the number of running Lambdas and avoid drowning your DB server in connections.
Jeremy Daly (serverless hero) has a great blog post on this: How To: Manage RDS Connections from AWS Lambda Serverless Functions
He also has a project, in Node unfortunately, that wraps the MySQL connection: serverless-mysql. It monitors connections and manages them automatically, for example by killing zombies. You might find something similar for Python.
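One way the Postgres-side cleanup could look, as a rough sketch: a query against `pg_stat_activity` that terminates sessions idle for longer than a threshold, run from a scheduled job. The function name and parameters are made up for illustration, the `%s` placeholders assume a psycopg2-style DB-API cursor, and the role running it needs permission to terminate other backends.

```python
# SQL that terminates sessions on one database that have been idle too long.
TERMINATE_IDLE_SQL = """
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = %s
  AND state = 'idle'
  AND pid <> pg_backend_pid()
  AND state_change < now() - make_interval(secs => %s)
"""


def terminate_idle_connections(cursor, database, max_idle_seconds):
    """Terminate idle sessions and return how many backends were signalled.

    `cursor` is assumed to be a DB-API cursor using %s placeholders.
    """
    cursor.execute(TERMINATE_IDLE_SQL, (database, max_idle_seconds))
    return len(cursor.fetchall())
```

Pick a threshold comfortably longer than your longest expected gap between warm invocations, otherwise you will kill connections that a warm Lambda still intends to reuse.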
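To pick a concurrency level, a back-of-the-envelope calculation is usually enough. The sketch below assumes each Lambda instance holds a fixed number of connections; the function and parameter names are made up for illustration.

```python
def max_reserved_concurrency(db_max_connections,
                             reserved_for_other_clients=0,
                             connections_per_instance=1):
    """Largest concurrency that cannot exceed the database's connection limit."""
    if connections_per_instance < 1:
        raise ValueError("connections_per_instance must be at least 1")
    available = db_max_connections - reserved_for_other_clients
    # Never return a negative value if other clients already use everything.
    return max(available // connections_per_instance, 0)
```

For instance, with a `max_connections` of 100 on the server and 20 connections kept back for admin tools and other services, `max_reserved_concurrency(100, 20)` gives 80.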
Upvotes: 10