Reputation: 51
I'm currently trying to deploy an application onto AWS ECS via CloudFormation templates. A Docker image is stored in AWS ECR and deployed into an ECS Service fronted by an Application Load Balancer.
My service starts, and my load balancer is created, but the tasks inside the ECS service repeatedly fail with the error:
Task failed ELB health checks in (target-group arn:aws:elasticloadbalancing:us-east-1:...
I've checked my security groups - the ECS Service security group allows ingress from the Load Balancer security group, and the Load Balancer is created successfully.
I've manually tried pulling my image from ECR and running it - no issues there. What am I missing? My template is below.
Resources:
  ECSRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: [ecs.amazonaws.com]
            Action: ['sts:AssumeRole']
      Path: /
      Policies:
        - PolicyName: ecs-service
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  # Rules which allow ECS to attach network interfaces to instances
                  # on your behalf in order for awsvpc networking mode to work right
                  - 'ec2:AttachNetworkInterface'
                  - 'ec2:CreateNetworkInterface'
                  - 'ec2:CreateNetworkInterfacePermission'
                  - 'ec2:DeleteNetworkInterface'
                  - 'ec2:DeleteNetworkInterfacePermission'
                  - 'ec2:Describe*'
                  - 'ec2:DetachNetworkInterface'
                  # Rules which allow ECS to update load balancers on your behalf
                  # with the information about how to send traffic to your containers
                  - 'elasticloadbalancing:DeregisterInstancesFromLoadBalancer'
                  - 'elasticloadbalancing:DeregisterTargets'
                  - 'elasticloadbalancing:Describe*'
                  - 'elasticloadbalancing:RegisterInstancesWithLoadBalancer'
                  - 'elasticloadbalancing:RegisterTargets'
                Resource: '*'
  # This is a role which is used by the ECS tasks themselves.
  ECSTaskExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Effect: Allow
            Principal:
              Service: [ecs-tasks.amazonaws.com]
            Action: ['sts:AssumeRole']
      Path: /
      Policies:
        - PolicyName: AmazonECSTaskExecutionRolePolicy
          PolicyDocument:
            Statement:
              - Effect: Allow
                Action:
                  # Allow the ECS Tasks to download images from ECR
                  - 'ecr:GetAuthorizationToken'
                  - 'ecr:BatchCheckLayerAvailability'
                  - 'ecr:GetDownloadUrlForLayer'
                  - 'ecr:BatchGetImage'
                  # Allow the ECS tasks to upload logs to CloudWatch
                  - 'logs:CreateLogStream'
                  - 'logs:PutLogEvents'
                Resource: '*'
  TaskDef:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Cpu: 4096
      Memory: 30720
      ContainerDefinitions:
        - Image: !Ref ECRImageUrl
          Name: !Sub "${ProjectName}-ecsContainer"
          PortMappings:
            - ContainerPort: 4000
              HostPort: 4000
              Protocol: tcp
      Family: !Sub "${ProjectName}-taskDef"
      ExecutionRoleArn: !Ref ECSTaskExecutionRole
      RequiresCompatibilities:
        - FARGATE
      NetworkMode: awsvpc
  Cluster:
    Type: AWS::ECS::Cluster
    Properties:
      ClusterName: !Sub "${ProjectName}-ECSCluster"
  Service:
    Type: AWS::ECS::Service
    DependsOn:
      - LoadBalancerListener
    Properties:
      Cluster: !Ref Cluster
      DesiredCount: 2
      LaunchType: FARGATE
      ServiceName: !Sub "${ProjectName}-ECSService"
      TaskDefinition: !Ref TaskDef
      NetworkConfiguration:
        AwsvpcConfiguration:
          SecurityGroups:
            - !Ref FargateContainerSecurityGroup
          AssignPublicIp: ENABLED
          Subnets: !Split [',', {'Fn::ImportValue': !Sub '${VPCStackName}-PublicSubnets'}]
      LoadBalancers:
        - ContainerName: !Sub "${ProjectName}-ecsContainer"
          ContainerPort: 4000
          TargetGroupArn: !Ref TargetGroup
  FargateContainerSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Access to the Fargate containers
      VpcId:
        Fn::ImportValue:
          !Sub '${VPCStackName}-VPC'
  EcsSecurityGroupIngressFromPublicALB:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      Description: Ingress from the public ALB
      GroupId: !Ref 'FargateContainerSecurityGroup'
      IpProtocol: -1
      SourceSecurityGroupId: !Ref 'PublicLoadBalancerSG'
  EcsSecurityGroupIngressFromSelf:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      Description: Ingress from other containers in the same security group
      GroupId: !Ref 'FargateContainerSecurityGroup'
      IpProtocol: -1
      SourceSecurityGroupId: !Ref 'FargateContainerSecurityGroup'
  PublicLoadBalancerSG:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Access to the public facing load balancer
      VpcId:
        Fn::ImportValue:
          !Sub '${VPCStackName}-VPC'
      SecurityGroupIngress:
        - CidrIp: 0.0.0.0/0
          IpProtocol: -1
  ACMCertificate:
    Type: AWS::CertificateManager::Certificate
    Properties:
      DomainName: !Sub ${ProjectName}.${DomainName}
      ValidationMethod: DNS
  TargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    DependsOn:
      - LoadBalancer
    Properties:
      TargetType: ip
      Name: !Sub "${ProjectName}-ECSService"
      Port: 4000
      Protocol: HTTP
      VpcId:
        Fn::ImportValue:
          !Sub '${VPCStackName}-VPC'
  LoadBalancer:
    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
      Scheme: internet-facing
      Subnets: !Split [',', {'Fn::ImportValue': !Sub '${VPCStackName}-PublicSubnets'}]
      SecurityGroups:
        - !Ref PublicLoadBalancerSG
  LoadBalancerListener:
    Type: AWS::ElasticLoadBalancingV2::Listener
    DependsOn:
      - LoadBalancer
    Properties:
      DefaultActions:
        - TargetGroupArn: !Ref TargetGroup
          Type: 'forward'
      LoadBalancerArn: !Ref LoadBalancer
      Port: 443
      Protocol: HTTP
Upvotes: 0
Views: 13440
Reputation: 539
After much pain and suffering, I found out that the ALB itself needs to be associated with the security group (SG) that allows traffic on the ports that get dynamically allocated by ECS. You should have an automatically created SG that covers those port ranges. Associate that SG with your ALB and your health checks will start passing (assuming everything else is hooked up correctly).
In addition, ensure that your task definition's network mode is set to "bridge" and that the "hostPort" value is set to 0 -- this tells ECS to dynamically allocate a port on the underlying EC2 instance and map it to your container port.
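For illustration, a minimal sketch of those two settings, assuming the EC2 launch type (bridge networking and dynamic host ports don't apply to Fargate's awsvpc mode). The resource names, image URI, and ContainerInstanceSecurityGroup below are placeholders; PublicLoadBalancerSG refers to the ALB security group from the question:

  # Hypothetical task definition using bridge networking with a dynamic host port
  BridgeTaskDef:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: my-app-taskDef              # placeholder family name
      NetworkMode: bridge                 # bridge mode so ECS can map dynamic host ports
      RequiresCompatibilities:
        - EC2                             # dynamic host ports require the EC2 launch type
      ContainerDefinitions:
        - Name: my-app-container          # placeholder container name
          Image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest  # placeholder image
          Memory: 512                     # container memory (MiB); required when no task-level memory is set
          PortMappings:
            - ContainerPort: 4000         # port the app listens on inside the container
              HostPort: 0                 # 0 = let ECS pick an ephemeral host port
              Protocol: tcp

  # Hypothetical ingress rule letting the ALB reach the container instances on the
  # ephemeral range that dynamic host ports are drawn from (commonly 32768-65535;
  # check your instances' configured range).
  InstanceIngressFromALB:
    Type: AWS::EC2::SecurityGroupIngress
    Properties:
      GroupId: !Ref ContainerInstanceSecurityGroup   # placeholder SG on the EC2 container instances
      IpProtocol: tcp
      FromPort: 32768
      ToPort: 65535
      SourceSecurityGroupId: !Ref PublicLoadBalancerSG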
Upvotes: 1
Reputation: 51
It turns out that my security groups were not permissive enough. Traffic coming in through a Network Load Balancer is seen as coming from its original source, so if your NLB is open to all traffic, your Fargate containers need to be as well. This fixed my issue:
FargateContainerSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: Access to the Fargate containers
    VpcId:
      Fn::ImportValue:
        !Sub '${VPCStackName}-VPC'
    SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: !Ref ApplicationPort
        ToPort: !Ref ApplicationPort
        CidrIp: 0.0.0.0/0
Upvotes: 4
Reputation: 5806
The health check feature automatically calls / on the target group's traffic port and expects a 200 status code in response. It's available under EC2 -> Target Groups -> your ECS target group. Make sure the port is 4000 and, in the health check settings, adjust the default path and expected response status code if needed.
Additionally, you can always try connecting to your EC2 instance on port 4000 using its public IP or DNS and see if that works.
If the EC2 instance doesn't respond on port 4000, troubleshoot the Docker deployment; something is wrong with the task definition or its parameters.
If it does respond, something is wrong with the target group targets or the health check configuration.
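For reference, a minimal sketch of how those health check settings could be declared on the question's TargetGroup resource; the path, interval, and thresholds here are assumptions to adjust to whatever the app actually serves:

  TargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      TargetType: ip
      Port: 4000
      Protocol: HTTP
      VpcId:
        Fn::ImportValue:
          !Sub '${VPCStackName}-VPC'
      HealthCheckProtocol: HTTP
      HealthCheckPort: traffic-port      # check the same port the container serves (4000 here)
      HealthCheckPath: /                 # assumed path; point it at an endpoint that returns 200
      HealthCheckIntervalSeconds: 30
      HealthyThresholdCount: 2
      UnhealthyThresholdCount: 3
      Matcher:
        HttpCode: '200'                  # expected response status code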
Hope this helps.
Upvotes: 1