Reputation: 99
My instances keep failing their ELB health checks, and I can't find any information on why that's happening. When I go to the target group in the console, the only information under 'Targets' is that the health check status is 'unhealthy', and the 'Health status details' just say 'Health checks failed'. How can I find the real reason my health checks are failing? Here's my Terraform code as well, which includes my load balancer, auto scaling group, listener, and target group:
main.tf
resource "aws_lb" "jira-alb" {
name = "jira-alb"
internal = false
load_balancer_type = "application"
security_groups = [aws_security_group.jira_clb_sg.id]
subnets = [var.public_subnet_ids[0], var.public_subnet_ids[1]]
enable_deletion_protection = false
access_logs {
bucket = aws_s3_bucket.this.id
enabled = true
}
tags = {
Environment = "production"
}
}
resource "aws_lb_target_group" "jira" {
name = "jira-tg"
port = 80
protocol = "HTTP"
vpc_id = var.vpc_id
health_check {
enabled = true
healthy_threshold = 10
unhealthy_threshold = 5
interval = 30
timeout = 5
path = "/index.html"
}
stickiness {
type = "lb_cookie"
cookie_duration = 1 ## CANT BE 0.. RANGES FROM 1-604800
}
}
resource "aws_lb_listener" "jira-listener" {
port = 443
protocol = "HTTPS"
ssl_policy = "ELBSecurityPolicy-TLS-1-2-2017-01"
load_balancer_arn = aws_lb.jira-alb.arn
certificate_arn = data.aws_acm_certificate.this.arn ##TODO Change to a variable
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.jira.arn
}
}
resource "aws_autoscaling_group" "this" {
vpc_zone_identifier = var.subnet_ids
health_check_grace_period = 300
health_check_type = "ELB"
force_delete = true
desired_capacity = 2
max_size = 2
min_size = 2
target_group_arns = [aws_lb_target_group.jira.arn]
timeouts {
delete = "15m"
}
launch_template {
id = aws_launch_template.this.id
# version = "$Latest"
version = aws_launch_template.this.latest_version
}
instance_refresh {
strategy = "Rolling"
preferences {
min_healthy_percentage = 50
}
}
}
I was expecting my health checks to pass and my instances to stay running, but the checks keep failing and the instances keep getting terminated and redeployed.
Here are the security groups for my load balancer and my auto scaling group as well:
security_groups.tf
resource "aws_security_group" "jira_clb_sg" {
description = "Allow-Veracode-approved-IPs from external to elb"
vpc_id = var.vpc_id
tags = {
Name = "public-elb-sg-for-jira"
Project = "Jira Module"
ManagedBy = "terraform"
}
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = var.veracode_ips
}
egress {
from_port = 0
to_port = 0
protocol = -1
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_security_group" "jira_sg" {
description = "Allow-Traffic-From-CLB"
vpc_id = var.vpc_id
tags = {
Name = "allow-jira-public-clb-sg"
Project = "Jira Module"
ManagedBy = "terraform"
}
ingress {
from_port = 0
to_port = 0
protocol = -1
security_groups = [aws_security_group.jira_clb_sg.id]
}
egress {
from_port = 0
to_port = 0
protocol = -1
cidr_blocks = ["0.0.0.0/0"]
}
}
My load balancer lets in traffic on port 443, and my auto scaling group's instances allow traffic on any port from the load balancer's security group.
Upvotes: 0
Views: 1215
Reputation: 538
Your health check is on port 80, but your security groups only open port 443.
As described in the official documentation:
"You must ensure that your load balancer can communicate with registered targets on both the listener port and the health check port. Whenever you add a listener to your load balancer or update the health check port for a target group used by the load balancer to route requests, you must verify that the security groups associated with the load balancer allow traffic on the new port in both directions."
Upvotes: 0