Archimedes Trajano

Reputation: 41328

How do you make Terraform wait for cloud-init to finish?

In my Terraform AWS Docker Swarm module I use cloud-init to initialize the EC2 instance. However, Terraform says the resource is ready before cloud-init finishes. Is there a way of making it wait for cloud-init to finish, ideally without SSHing or checking for a port to be up using a null resource?

Upvotes: 16

Views: 12162

Answers (2)

rpadovani

Reputation: 7360

Another possible approach is to use AWS Systems Manager Run Command, if the SSM Agent is available on your AMI.

Create an SSM document with Terraform that runs cloud-init status --wait, trigger it from a local-exec provisioner, and wait for it to complete. This way you don't have to play around with tags, and the command only returns once cloud-init has finished.

This is an example of the document you can create with Terraform:

resource "aws_ssm_document" "cloud_init_wait" {
  name = "cloud-init-wait"
  document_type = "Command"
  document_format = "YAML"
  content = <<-DOC
    schemaVersion: '2.2'
    description: Wait for cloud init to finish
    mainSteps:
    - action: aws:runShellScript
      name: StopOnLinux
      precondition:
        StringEquals:
        - platformType
        - Linux
      inputs:
        runCommand:
        - cloud-init status --wait
    DOC
}

You can then use a local-exec provisioner inside the EC2 instance block or in a null_resource, depending on what you need to do with it.

The provisioner would be more or less like this:

provisioner "local-exec" {
  interpreter = ["/bin/bash", "-c"]

  command = <<-EOF
  set -Ee -o pipefail
  export AWS_DEFAULT_REGION=${data.aws_region.current.name}

  # Run the cloud-init-wait document on the instance and capture the command ID.
  command_id=$(aws ssm send-command --document-name ${aws_ssm_document.cloud_init_wait.arn} --instance-ids ${self.id} --output text --query "Command.CommandId")
  # Block until the command finishes; on failure, print its output and abort.
  if ! aws ssm wait command-executed --command-id "$command_id" --instance-id ${self.id}; then
    echo "Failed to start services on instance ${self.id}!"
    echo "stdout:"
    aws ssm get-command-invocation --command-id "$command_id" --instance-id ${self.id} --query StandardOutputContent
    echo "stderr:"
    aws ssm get-command-invocation --command-id "$command_id" --instance-id ${self.id} --query StandardErrorContent
    exit 1
  fi
  echo "Services started successfully on the new instance with id ${self.id}!"
  EOF
}
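
If the provisioner lives in a null_resource rather than in the aws_instance block, self.id no longer refers to the instance. Here is a minimal sketch of that variant. It assumes a hypothetical aws_instance.manager resource and a region data source named current; the snippet above references data.aws_region.current without showing its declaration. The instance ID is passed through triggers so the wait reruns when the instance is replaced.

```hcl
# Assumed declaration: the provisioner reads data.aws_region.current.
data "aws_region" "current" {}

# Hypothetical wrapper; aws_instance.manager is a placeholder name.
resource "null_resource" "cloud_init_wait" {
  # Re-run the wait whenever the instance is replaced.
  triggers = {
    instance_id = aws_instance.manager.id
  }

  provisioner "local-exec" {
    interpreter = ["/bin/bash", "-c"]

    command = <<-EOF
    set -Ee -o pipefail
    export AWS_DEFAULT_REGION=${data.aws_region.current.name}

    # Inside a null_resource, the instance ID comes from self.triggers.
    command_id=$(aws ssm send-command --document-name ${aws_ssm_document.cloud_init_wait.arn} --instance-ids ${self.triggers.instance_id} --output text --query "Command.CommandId")
    aws ssm wait command-executed --command-id "$command_id" --instance-id ${self.triggers.instance_id}
    EOF
  }
}
```

One thing to watch: aws ssm wait command-executed gives up after a fixed number of polls (the AWS CLI reference documents 20 checks, 5 seconds apart), so a long cloud-init run may need a retry loop around it.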

Upvotes: 3

Alain O'Dea

Reputation: 21696

Your managers and workers both use template_cloudinit_config. They also have the ec2:CreateTags permission.

You can use an EC2 resource tag like trajano/terraform-docker-swarm-aws/cloudinit-complete to indicate that cloud-init has finished.

You could add this final part to each config to invoke a tagging script:

part {
  filename     = "tag_complete.sh"
  content      = local.tag_complete_script
  content_type = "text/x-shellscript"
}

And declare tag_complete_script as follows:

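locals {
  tag_complete_script = <<-EOF
  #!/bin/bash
  # Fetch an IMDSv2 session token, then use it to look up this instance's ID.
  # Command substitution uses $(...), which Terraform leaves alone; a
  # dollar-brace sequence would be parsed as a Terraform interpolation.
  TOKEN="$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")"
  instance_id="$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-id)"
  aws ec2 create-tags --resources "$instance_id" --tags 'Key=trajano/terraform-docker-swarm-aws/cloudinit-complete,Value=true'
  EOF
}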

Then with a null_resource, you wait for the tag to appear (wrote this on my phone, so use it for a general idea, but I don't expect that it will work without testing and edits):

resource "null_resource" "wait_for_cloudinit" {
  provisioner "local-exec" {
    # [[ ]] is a bashism, so run under bash rather than the default /bin/sh.
    interpreter = ["/bin/bash", "-c"]
    command = <<-EOF
    # Count how many managers already carry the completion tag.
    poll_tags() {
      aws ec2 describe-tags \
        --filters 'Name=resource-id,Values=${join(",", aws_instance.managers[*].id)}' \
                  'Name=key,Values=trajano/terraform-docker-swarm-aws/cloudinit-complete' \
        --output text --query "length(Tags[?Value=='true'])"
    }
    expected=${length(aws_instance.managers)}
    # Poll until every manager is tagged; sleep to avoid hammering the API.
    until [[ "$(poll_tags)" == "$expected" ]]; do
      sleep 5
    done
    EOF
  }
}

This way, any resource that needs to run after cloud-init has completed can declare a dependency on null_resource.wait_for_cloudinit.
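
As a sketch of such a dependency, assume a hypothetical aws_instance.workers resource that should only be created once the managers have finished cloud-init. The resource name and all three variables are placeholders, not part of the original module.

```hcl
# Placeholder inputs for the illustration.
variable "worker_count" {
  type = number
}

variable "ami_id" {
  type = string
}

variable "instance_type" {
  type = string
}

# Hypothetical downstream resource.
resource "aws_instance" "workers" {
  count         = var.worker_count
  ami           = var.ami_id
  instance_type = var.instance_type

  # Terraform starts creating these only after the local-exec wait has returned.
  depends_on = [null_resource.wait_for_cloudinit]
}
```

Because depends_on orders whole resources, every worker waits for all managers to be tagged, not just one.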

Upvotes: 8
