chrispytoes

Reputation: 1889

Capacity provider instances not being added to cluster

I'm new to AWS and I'm trying to provision an ECS cluster with a capacity provider via Terraform. My plan currently applies without errors, and I can see that the capacity provider creates my instances, but the instances never get registered with the cluster, even though the capacity provider shows up on the cluster's edit page in the web console.

Here is my config for the cluster:

resource "aws_ecs_cluster" "cluster" {
  name = "main"

  depends_on = [
    null_resource.iam_wait
  ]
}

data "aws_ami" "amazon_linux_2" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-ecs-hvm-*-x86_64-ebs"]
  }
}

resource "aws_launch_configuration" "cluster" {
  name = "cluster-${aws_ecs_cluster.cluster.name}"
  image_id = data.aws_ami.amazon_linux_2.image_id
  instance_type = "t2.small"

  security_groups = [module.vpc.default_security_group_id]
  iam_instance_profile = aws_iam_instance_profile.cluster.name
}

resource "aws_autoscaling_group" "cluster" {
  name = aws_ecs_cluster.cluster.name
  launch_configuration = aws_launch_configuration.cluster.name
  vpc_zone_identifier = module.vpc.private_subnets

  min_size = 3
  max_size = 3
  desired_capacity = 3

  tag {
    key = "ClusterName"
    value = aws_ecs_cluster.cluster.name
    propagate_at_launch = true
  }

  tag {
    key = "AmazonECSManaged"
    value = ""
    propagate_at_launch = true
  }
}

resource "aws_ecs_capacity_provider" "cluster" {
  name = aws_ecs_cluster.cluster.name

  auto_scaling_group_provider {
    auto_scaling_group_arn = aws_autoscaling_group.cluster.arn

    managed_scaling {
      status = "ENABLED"
      maximum_scaling_step_size = 1
      minimum_scaling_step_size = 1
      target_capacity = 3
    }
  }
}

resource "aws_ecs_cluster_capacity_providers" "cluster" {
  cluster_name = aws_ecs_cluster.cluster.name

  capacity_providers = [aws_ecs_capacity_provider.cluster.name]

  default_capacity_provider_strategy {
    base = 1
    weight = 100
    capacity_provider = aws_ecs_capacity_provider.cluster.name
  }
}

The instance profile role has this policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeTags",
        "ecs:CreateCluster",
        "ecs:DeregisterContainerInstance",
        "ecs:DiscoverPollEndpoint",
        "ecs:Poll",
        "ecs:RegisterContainerInstance",
        "ecs:StartTelemetrySession",
        "ecs:Submit*",
        "ecr:GetAuthorizationToken",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    }
  ]
}
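
For reference, the role and instance profile behind aws_iam_instance_profile.cluster are wired up roughly like this (a simplified sketch; the resource and file names here are illustrative, but the role's trust policy allows ec2.amazonaws.com and the profile is the one referenced by the launch configuration):

# Trust policy so EC2 instances can assume the role
data "aws_iam_policy_document" "ecs_instance_assume" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "cluster" {
  name               = "ecs-instance-role"
  assume_role_policy = data.aws_iam_policy_document.ecs_instance_assume.json
}

# The inline policy shown above, attached to the role
resource "aws_iam_role_policy" "cluster" {
  name   = "ecs-instance-policy"
  role   = aws_iam_role.cluster.id
  policy = file("${path.module}/ecs-instance-policy.json")
}

# The profile referenced by the launch configuration
resource "aws_iam_instance_profile" "cluster" {
  name = "ecs-instance-profile"
  role = aws_iam_role.cluster.name
}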

I've read that this can happen if the instances do not have the proper roles, but as far as I can tell I've set up my roles correctly, and I haven't been able to find any permission errors anywhere.

Another strange thing I've noticed is that if another cluster named "default" exists, the instances register themselves with that cluster instead, even though the capacity provider is still attached to my original cluster.

Upvotes: 1

Views: 2162

Answers (2)

I just ran into this problem, and adding the cluster name to /etc/ecs/ecs.config by itself did not help. I also had to create the /etc/ecs/ directory before I could add the file, which should have been a clue.

This was because I was not using an ECS-optimised AMI but a regular Debian instance, so the ecs-agent has to be installed as an extra step, as described in https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-agent-install.html#ecs-agent-install-nonamazonlinux (you may need to install docker first, as I did).

Once I did that manually, the instance showed up in the cluster immediately. I then added the docker and ecs-agent installation steps to the user_data so that every instance launched gets set up correctly.
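
In Terraform terms, the user_data ends up looking roughly like the sketch below (simplified, not my exact script; the linked doc has the full recommended set of docker run flags, and the cluster-name interpolation assumes the same Terraform setup as the question):

resource "aws_launch_configuration" "cluster" {
  # ... image_id, instance_type, security_groups, iam_instance_profile as before ...

  user_data = <<-EOF
    #!/bin/bash
    # Docker first; Debian's packaged docker.io is enough for the agent
    apt-get update -y
    apt-get install -y docker.io

    # Point the agent at the right cluster
    mkdir -p /etc/ecs /var/log/ecs /var/lib/ecs/data
    echo "ECS_CLUSTER=${aws_ecs_cluster.cluster.name}" >> /etc/ecs/ecs.config

    # Run the ECS agent as a container (see the linked doc for the full flag list)
    docker run --name ecs-agent \
      --detach=true \
      --restart=on-failure:10 \
      --net=host \
      --volume=/var/run:/var/run \
      --volume=/var/log/ecs:/log \
      --volume=/var/lib/ecs/data:/data \
      --volume=/etc/ecs:/etc/ecs \
      --env-file=/etc/ecs/ecs.config \
      amazon/amazon-ecs-agent:latest
  EOF
}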

Upvotes: 0

chrispytoes

Reputation: 1889

Figured it out! I just had to set user_data in my launch configuration as shown below.

resource "aws_launch_configuration" "cluster" {
  name = "cluster-${aws_ecs_cluster.cluster.name}"
  image_id = data.aws_ami.amazon_linux_2.image_id
  instance_type = "t2.small"

  security_groups = [module.vpc.default_security_group_id]
  iam_instance_profile = aws_iam_instance_profile.cluster.name

  user_data = "#!/bin/bash\necho ECS_CLUSTER=${aws_ecs_cluster.cluster.name} >> /etc/ecs/ecs.config"
}
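
For anyone hitting the same thing: the ECS agent reads /etc/ecs/ecs.config at startup and registers with the cluster named "default" when ECS_CLUSTER isn't set, which explains why my instances were registering themselves with the "default" cluster earlier. If the script grows, the same user_data can be written as a heredoc (equivalent, just easier to read; shown here as a sketch):

resource "aws_launch_configuration" "cluster" {
  # ... other arguments unchanged ...

  user_data = <<-EOF
    #!/bin/bash
    echo "ECS_CLUSTER=${aws_ecs_cluster.cluster.name}" >> /etc/ecs/ecs.config
  EOF
}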

Upvotes: 6
