Paul Leclercq

Reputation: 1018

How to do a graceful shutdown of an AWS ASG with Terraform?

Problem

On a terraform destroy, all ASG resources are terminated and some services (spark streaming in my case) may still have data to process.

To be sure my app shuts down gracefully, I currently connect to each instance of my ASG and run systemctl stop service. I would like to automate this process with Terraform.

Leads

I know about the when = "destroy" keyword and the remote-exec provisioner, but I'm not sure what the recommended way is to gracefully shut down instances in an ASG.

resource "aws_instance" "app" {
  # ...

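  provisioner "remote-exec" {
    # Runs on the instance just before Terraform destroys it.
    # Terraform 0.12+ expects the bare keyword; 0.11 used when = "destroy".
    when   = destroy
    inline = ["systemctl stop service"]
  }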
}

source:

Upvotes: 1

Views: 1856

Answers (1)

ydaetskcoR

Reputation: 56997

You can use autoscaling group lifecycle hooks to prevent the ASG from terminating an instance before the hook is marked as complete.

You can attach a termination lifecycle hook to your ASG using the aws_autoscaling_lifecycle_hook resource:

resource "aws_autoscaling_group" "example" {
  availability_zones   = ["us-west-2a"]
  name                 = "example"
  min_size             = 1
  max_size             = 2
}

resource "aws_autoscaling_lifecycle_hook" "example" {
  name                   = "example"
  autoscaling_group_name = aws_autoscaling_group.example.name
  default_result         = "CONTINUE"
  heartbeat_timeout      = 300
  lifecycle_transition   = "autoscaling:EC2_INSTANCE_TERMINATING"

  notification_target_arn = "arn:aws:sqs:us-west-2:444455556666:queue1*"
  role_arn                = "arn:aws:iam::123456789012:role/S3Access"
}

The above example makes the ASG wait up to 5 minutes (300 seconds, the heartbeat_timeout) after marking an instance for termination. When the ASG attempts to terminate an instance, the lifecycle hook fires and sends a notification to the notification_target_arn, which can be either an SQS queue or an SNS topic.

You would then need to handle the notification with something that can perform whatever actions you want. In your case, you might have a small application running on each instance that polls the SQS queue for a termination notice matching its own instance ID and, when it receives one, stops the service. Alternatively, you could have an SNS topic trigger a Lambda function to perform some action.
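As a rough illustration of the "is this notice for me?" check such a polling application would need, here is a minimal Python sketch. The function name termination_notice_for and the sample instance ID are hypothetical. The field names (LifecycleTransition, EC2InstanceId) are assumed to match the JSON body that Auto Scaling places on the queue, so verify them against a real message before relying on this.

```python
import json
from typing import Optional

# Transition name used by the lifecycle hook in the Terraform example.
TERMINATING = "autoscaling:EC2_INSTANCE_TERMINATING"


def termination_notice_for(body: str, instance_id: str) -> Optional[dict]:
    """Return the parsed notice if it is a termination notice for
    instance_id, otherwise None.

    `body` is the raw SQS message body (assumed to be a JSON object).
    """
    try:
        notice = json.loads(body)
    except ValueError:
        # Not JSON at all: ignore the message.
        return None
    if not isinstance(notice, dict):
        return None
    # Skip test notifications and launch transitions.
    if notice.get("LifecycleTransition") != TERMINATING:
        return None
    # Skip notices addressed to other instances in the group.
    if notice.get("EC2InstanceId") != instance_id:
        return None
    return notice


if __name__ == "__main__":
    sample = json.dumps({
        "LifecycleTransition": TERMINATING,
        "EC2InstanceId": "i-0123456789abcdef0",  # hypothetical ID
    })
    print(termination_notice_for(sample, "i-0123456789abcdef0") is not None)
```

Because every instance reads from the same queue, a poller built on this check should leave messages addressed to other instances on the queue rather than deleting them, so the instance they belong to can still receive them.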

Once the action has been completed, you need to mark the lifecycle hook as complete by calling the AWS API with the relevant information. Alternatively, you can wait for the timeout and let the ASG continue with the termination as per the default_result parameter on the aws_autoscaling_lifecycle_hook resource.
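Marking the hook as complete could look roughly like the Python sketch below. It assumes you pass in a boto3 Auto Scaling client (created with boto3.client("autoscaling")) and a dict holding the LifecycleHookName, AutoScalingGroupName and LifecycleActionToken fields taken from the notification. The helper name complete_termination is hypothetical. The instance also needs IAM permission to call autoscaling:CompleteLifecycleAction.

```python
def complete_termination(autoscaling_client, notice: dict,
                         result: str = "CONTINUE") -> dict:
    """Tell the ASG that this instance has finished its shutdown work.

    autoscaling_client: a boto3 Auto Scaling client (assumed to be created
        by the caller, e.g. boto3.client("autoscaling")).
    notice: parsed lifecycle notification with the three fields used below.
    result: "CONTINUE" or "ABANDON"; for a terminating instance both end
        with the instance being terminated.
    """
    params = {
        "LifecycleHookName": notice["LifecycleHookName"],
        "AutoScalingGroupName": notice["AutoScalingGroupName"],
        "LifecycleActionToken": notice["LifecycleActionToken"],
        "LifecycleActionResult": result,
    }
    # Sends the CompleteLifecycleAction request; the ASG then proceeds
    # without waiting for heartbeat_timeout to expire.
    autoscaling_client.complete_lifecycle_action(**params)
    return params
```

Call this only after systemctl stop service has returned, so the ASG does not terminate the instance while the service is still draining.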

Upvotes: 1
