Reputation: 2644
We are receiving random 502 errors from our ALB, our backend does not get hit at all as there are no logs of the request. Nothing from the ALB logs just the 502 error but nothing usable for debugging.
h2 2020-03-26T14:30:52.495547Z app/path/tomytarget 10.111.11.111:50103 100.00.00.00:8080:8080 0.001 18.799 -1 502 - 1213 208 "POST https://mydomain:443/user/auth HTTP/2.0" "Name/3 CFNetwork/1121.2.2 Darwin/19.2.0" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 arn:aws:elasticloadbalancing:ap-southeast-1:0000000000:targetgroup/path/tomytarget "Root=someId" "mydomain.com" "arn:aws:acm:ap-southeast-1:0000000000:certificate/certificatedId" 0 2020-03-26T14:30:33.694000Z "forward" "-" "-" "100.00.00.00:8080" "-"
We started noticing it after we enabled the health checks with a proper route in nodejs and express
app.get("/health-check", (req, res) => {
res.status(200).end();
});
This is our ALB config and we are using VPC peering to another VPC
ElasticLoadBalancer:
Type: AWS::ElasticLoadBalancingV2::LoadBalancer
Properties:
IpAddressType: ipv4
Scheme: internet-facing
SecurityGroups:
- !Ref ELBSecurityGroup
Subnets:
- !Ref PublicSubnetA
- !Ref PublicSubnetB
Type: application
TargetGroup:
Type: AWS::ElasticLoadBalancingV2::TargetGroup
Properties:
Port: 8080
Protocol: HTTP
Targets:
- Id: <some ip in the other VPC>
AvailabilityZone: all
Port: 8080
TargetType: ip
VpcId: !Ref VPC
HealthCheckEnabled: true
HealthCheckIntervalSeconds: 30
HealthCheckPath: /health-check
HealthCheckPort: 8080
HealthCheckProtocol: HTTP
HealthCheckTimeoutSeconds: 5
HealthyThresholdCount: 3
UnhealthyThresholdCount: 5
Listener:
Type: AWS::ElasticLoadBalancingV2::Listener
Properties:
DefaultActions:
- Type: forward
TargetGroupArn: !Ref TargetGroup
LoadBalancerArn: !Ref ElasticLoadBalancer
Certificates:
- CertificateArn: !Ref CertificateArn
Port: 443
Protocol: HTTPS
As I said we are using VPC peering and HTTPS with certificates coming from AWS certificate manager
if you are using nodejs one solution could be the following
// AWS ALB keepAlive is set to 60 seconds, we need to increase the default KeepAlive timeout
// of our node server
server.keepAliveTimeout = 65000; // Ensure all inactive connections are terminated by the ALB, by setting this a few seconds higher than the ALB idle timeout
server.headersTimeout = 66000; // Ensure the headersTimeout is set higher than the keepAliveTimeout due to this nodejs regression bug: https://github.com/nodejs/node/issues/27363
Upvotes: 3
Views: 1876
Reputation: 2644
I have solved like this in nodejs
// AWS ALB keepAlive is set to 60 seconds, we need to increase the default KeepAlive timeout
// of our node server
server.keepAliveTimeout = 65000; // Ensure all inactive connections are terminated by the ALB, by setting this a few seconds higher than the ALB idle timeout
server.headersTimeout = 66000; // Ensure the headersTimeout is set higher than the keepAliveTimeout due to this nodejs regression bug: https://github.com/nodejs/node/issues/27363
Upvotes: 2