Legion_of_boom__

Reputation: 99

How can I find out why my health checks are failing?

My instances keep failing their ELB health checks, and I can't find any information on why. In the console, under 'targets' for the target group, the only information I get is that the health check status is 'unhealthy', and the 'health status details' just say 'health checks failed'. How can I find the real reason my health checks are failing? Here is my Terraform code, which includes my load balancer, Auto Scaling group, listener, and target group:

main.tf

resource "aws_lb" "jira-alb" {
  name               = "jira-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.jira_clb_sg.id]
  subnets            = [var.public_subnet_ids[0], var.public_subnet_ids[1]]

  enable_deletion_protection = false

  access_logs {
    bucket  = aws_s3_bucket.this.id
    enabled = true
  }

  tags = {
    Environment = "production"
  }
}

resource "aws_lb_target_group" "jira" {
  name     = "jira-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  health_check {
    enabled             = true
    healthy_threshold   = 10
    unhealthy_threshold = 5
    interval            = 30
    timeout             = 5
    path                = "/index.html"
  }

  stickiness {
    type            = "lb_cookie"
    cookie_duration = 1 # Can't be 0; ranges from 1 to 604800
  }
}

resource "aws_lb_listener" "jira-listener" {
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS-1-2-2017-01"
  load_balancer_arn = aws_lb.jira-alb.arn
  certificate_arn   = data.aws_acm_certificate.this.arn # TODO: change to a variable

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.jira.arn
  }
}

resource "aws_autoscaling_group" "this" {
  vpc_zone_identifier       = var.subnet_ids
  health_check_grace_period = 300
  health_check_type         = "ELB"
  force_delete              = true
  desired_capacity          = 2
  max_size                  = 2
  min_size                  = 2
  target_group_arns         = [aws_lb_target_group.jira.arn]

  timeouts {
    delete = "15m"
  }

  launch_template {
    id      = aws_launch_template.this.id
    # version = "$Latest"
    version = aws_launch_template.this.latest_version
  }

  instance_refresh {
    strategy = "Rolling"
    preferences {
      min_healthy_percentage = 50
    }
  }
}

I expected my health checks to pass and my instances to stay running, but the checks keep failing and the instances keep getting redeployed.

Here are the security groups for my load balancer and my Auto Scaling group as well:

security_groups.tf

resource "aws_security_group" "jira_clb_sg" {
  description = "Allow-Veracode-approved-IPs from external to elb"
  vpc_id      = var.vpc_id

  tags = {
    Name      = "public-elb-sg-for-jira"
    Project   = "Jira Module"
    ManagedBy = "terraform"
  }

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = var.veracode_ips
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = -1
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "jira_sg" {
  description = "Allow-Traffic-From-CLB"
  vpc_id      = var.vpc_id

  tags = {
    Name      = "allow-jira-public-clb-sg"
    Project   = "Jira Module"
    ManagedBy = "terraform"
  }

  ingress {
    from_port       = 0
    to_port         = 0
    protocol        = -1
    security_groups = [aws_security_group.jira_clb_sg.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = -1
    cidr_blocks = ["0.0.0.0/0"]
  }
}

My load balancer's security group allows inbound traffic on port 443, and the security group for my Auto Scaling group's instances allows traffic on any port from the load balancer's security group.

Upvotes: 0

Views: 1215

Answers (1)

Leo

Reputation: 538

Your health check is on port 80, but your security groups only open port 443.

As described in the official documentation:

"You must ensure that your load balancer can communicate with registered targets on both the listener port and the health check port. Whenever you add a listener to your load balancer or update the health check port for a target group used by the load balancer to route requests, you must verify that the security groups associated with the load balancer allow traffic on the new port in both directions."
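One way to make the health check port explicit and to name it in the instance security group is sketched below. Treat it as a minimal sketch rather than a drop-in fix: the resource names `jira_explicit` and `jira_sg_explicit` and the target group name `jira-tg-explicit` are hypothetical, the `matcher` value assumes the application answers `/index.html` with HTTP 200, and the lowered `healthy_threshold` is an assumption (with the original settings, 10 checks at a 30-second interval take 300 seconds, the same length as the 300-second grace period on the Auto Scaling group).

```hcl
# Hypothetical variant of the target group with the health check spelled out.
resource "aws_lb_target_group" "jira_explicit" {
  name     = "jira-tg-explicit" # hypothetical name
  port     = 80
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  health_check {
    enabled             = true
    port                = "traffic-port" # same port as the target (80 here)
    protocol            = "HTTP"
    path                = "/index.html"
    matcher             = "200" # assumption: the app returns 200 on this path
    healthy_threshold   = 3     # assumption: 3 x 30 s = 90 s to become healthy
    unhealthy_threshold = 5
    interval            = 30
    timeout             = 5
  }
}

# Hypothetical variant of the instance security group that names port 80.
resource "aws_security_group" "jira_sg_explicit" {
  description = "Allow HTTP and health checks from the ALB"
  vpc_id      = var.vpc_id

  ingress {
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.jira_clb_sg.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```

Note that the `jira_sg` group in the question already admits every port from the load balancer's security group, so the narrower rule here mostly documents intent. If the checks still fail with the port open, it is worth confirming on the instance itself that something is listening on port 80 and that `/index.html` returns the expected status code.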

Upvotes: 0
