Reputation: 5824
I have a service which is deployed on Amazon Web Services (AWS), specifically 2 instances behind an Elastic Load Balancer (ELB). Availability zones are selected as all three us-west-2a,b,c but only 2 of the above 3 zones have instances running in it.
The issues is that even though the traffic/load is not too high but I still get HTTP 504 errors from ELB often enough.
The log lines reads like this
-1 -1 -1 504 0 0 0
In order, --request_processing_time --backend_processing_time --response_processing_time --elb_status_code --backend_status_code --received_bytes --sent_bytes. Description of what each field and response means can be found here
ELB idle timeout is 60 seconds. KeepAlive
is 'On' on backend instances. Latency of requests from ELB are in check. I have tried increasing KeepAliveTimeout
but to no avail.
Does anyone have any idea about how to proceed? I don't even know the root cause of this issue.
PS: More like a second question, there are a few cases (much less than 504 being returned by ELB when backend does not even accept the request) where even backend is returning a 504 and then ELB is forwarding the same to client. To the best of my knowledge HTTP 504 should be returned by a proxy only when backend is timing out. How can a server itself return a 504?
Upvotes: 11
Views: 4348
Reputation: 5824
So that it might assist others in future, I am publishing my finding(s) here:
1) This 504 0 HTTP errors were mainly because of logrotate reloading apache instead of graceful restart. The current AWS config does the following
/sbin/service httpd reload > /dev/null 2>/dev/null || true
so replace the service command with either apachectl -k graceful
or /sbin/service httpd graceful
File location on my ec2 instance: /etc/logrotate.elasticbeanstalk.hourly/logrotate.elasticbeanstalk.httpd.conf
2) Because logrotate frequency was too high by default in AWS (once every hour), at least for my use case, and that in turn was reloading apache every hour, so I reduced that as well.
Upvotes: 7
Reputation: 7116
When backend connection timeout, ELB will put -1 to backend_processing_time column in its access log. Think what's happening is that some of your requests take longer than 60s for your backend to process. To confirm this, can you check your latency metrics? Please switch to maximum when viewing this metric. It will confirm my guess if you see latency frequently reaches 60s.
After it got confirmed, you might want to increase Idle timeout of your ELB and backend.
Upvotes: 0