504s from nginx on EC2 running Node.js causing 503s at ELB

Question

Issue

I have a Node.js app running on 6 EC2 instances with nginx, all behind an ELB. I've been seeing an increase in 504 Gateway Time-out errors from nginx on the EC2 instances. The affected hosts are marked unhealthy and taken out of service by the ELB, which eventually causes the ELB to return 503 Service Unavailable: Back-end server is at capacity.

Question

The increase in 504s from nginx on the EC2 instances is likely due to slow queries or an increase in throughput, and fixing that is obviously the priority. The main question I'm posing here, though, is:

What is the optimal timeout config for nginx, ELB, etc. to keep them all working together nicely and prevent these domino effects that take down the ELB?

Most of the solutions I've come across deal with Apache or PHP settings, and I'm unsure whether the nginx settings I'm finding really apply to my current setup (should I care about fastcgi or proxy settings?).

Current Config

Here is a breakdown of my current config. Any other guidance would be much appreciated.

In nginx.conf, I have this:

http {
    ...
    keepalive_timeout 95;
    ...
}

Amazon says to "make sure that the value you set for the keep-alive time is greater than the idle timeout setting on your load balancer", so I'm covered here, since the ELB Idle Timeout is set to 90 seconds. I'm not sure whether I should set more timeouts explicitly in nginx.conf rather than rely on defaults, or whether I should also be looking elsewhere for other non-default settings.

I'm also using the defaults in Node.js, which I believe include a 120000 ms request timeout.
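For reference, here is a minimal sketch of how the Node.js timeouts could be set explicitly instead of left at the defaults, assuming a plain `http` server. The `createServer` helper and the specific values are hypothetical, chosen only so that Node.js outlasts the 95-second nginx keepalive and the 90-second ELB idle timeout; they are not a tested recommendation.

```javascript
const http = require("http");

// Hypothetical values (ms), picked so each layer outlasts the one in front of it:
// ELB idle timeout (90 s) < nginx keepalive_timeout (95 s) < Node.js.
const KEEP_ALIVE_MS = 100 * 1000;
const SOCKET_TIMEOUT_MS = 120 * 1000;

// Hypothetical helper: builds the server with explicit timeouts but does not listen.
function createServer() {
  const server = http.createServer((req, res) => {
    res.end("ok\n");
  });

  // How long an idle keep-alive connection is held open after a response.
  // Only relevant if nginx reuses its upstream connections.
  server.keepAliveTimeout = KEEP_ALIVE_MS;

  // Socket inactivity timeout; sets server.timeout.
  server.setTimeout(SOCKET_TIMEOUT_MS);

  return server;
}
```

Calling `createServer().listen(3000)` (the port is also an assumption) would start it with these values. Note that the defaults for these properties differ between Node.js versions, which is one reason to set them explicitly.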

ELB has the following Connection Settings:

Idle Timeout: 90 seconds

ELB has the following Health Check settings:

Ping Protocol: HTTP
Timeout: 59 seconds
Interval: 60 seconds
Unhealthy Threshold: 3
Healthy Threshold: 2
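
As a rough back-of-the-envelope check on these numbers, the sketch below estimates how long a host takes to change state. It assumes the ELB simply needs N consecutive results at one check per interval, which is my reading of the settings rather than a documented formula; the `secondsToFlip` helper is hypothetical.

```javascript
// Settings copied from the health check above (seconds).
const healthCheck = {
  timeout: 59,
  interval: 60,
  unhealthyThreshold: 3,
  healthyThreshold: 2,
};

// Rough estimate: `threshold` consecutive checks, one per interval.
function secondsToFlip(intervalSec, threshold) {
  return intervalSec * threshold;
}

const toUnhealthy = secondsToFlip(healthCheck.interval, healthCheck.unhealthyThreshold);
const toHealthy = secondsToFlip(healthCheck.interval, healthCheck.healthyThreshold);

console.log(`~${toUnhealthy}s to mark unhealthy, ~${toHealthy}s to mark healthy again`);
```

Under that assumption, a failing host keeps receiving traffic for about three minutes, and a recovered host waits about two minutes before it is put back in service.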

Again, any guidance here is much appreciated.
