Reputation: 572
I'm having an issue where ELB gives me a timeout when connecting to a service on the same EC2 instance.
I have an ECS cluster with two EC2 instances (launched through the ECS wizard). I'm currently running two services: a RabbitMQ queue, and two Celery workers. I put an internal ELB network load balancer in front of the RabbitMQ container.
The Celery worker on the other EC2 instance can connect without issues, but the worker that's on the same host as the RabbitMQ container can't connect:
[2018-01-24 12:00:55,128: ERROR/MainProcess] consumer: Cannot connect to amqp://user:**@rabbitmq-abcdefghijklmnop.elb.eu-central-1.amazonaws.com:5672//: timed out.
I've checked the flow logs for the VPC, and all packets are accepted (.157 being the EC2 instance, .136 the ELB):
Upvotes: 0
Views: 660
Reputation: 179054
A Network Load Balancer presents the connection to the server as though it came directly from the client machine's IP address. The network infrastructure then rewrites the replies back to the correct address/port pairs.
But when the server replies, it replies to that source address, and in your configuration that source address belongs to the same machine. That machine didn't try to connect to itself; it tried to connect to a different address (the load balancer's). The reply is delivered locally instead of passing back through the load balancer, so the forward-path and return-path source and destination address/port pairs don't correlate, and the connection times out.
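The mismatch described above can be sketched as a toy model. This is a deliberately simplified illustration, not how AWS implements the balancer: the `Packet` type, the function names, and the full `10.0.0.x` addresses are all hypothetical (only the `.157` and `.136` suffixes come from the question), and ports are left out.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    src: str
    dst: str


def nlb_forward(pkt: Packet, target_ip: str) -> Packet:
    # Assumption: the balancer rewrites only the destination and
    # preserves the client's source address.
    return Packet(src=pkt.src, dst=target_ip)


def connection_succeeds(client_ip: str, nlb_ip: str, target_ip: str) -> bool:
    syn = Packet(src=client_ip, dst=nlb_ip)
    forwarded = nlb_forward(syn, target_ip)

    # The server answers whatever source address it saw.
    reply = Packet(src=forwarded.dst, dst=forwarded.src)

    if reply.dst == target_ip:
        # Client and server share a host: the reply is delivered locally
        # and never passes back through the balancer, so it is not rewritten.
        seen_by_client = reply
    else:
        # Reply crosses the network, where its source is rewritten
        # back to the balancer's address.
        seen_by_client = Packet(src=nlb_ip, dst=reply.dst)

    # The client only accepts a reply from the address it connected to.
    return seen_by_client.src == syn.dst and seen_by_client.dst == syn.src


# Hypothetical addresses: .136 is the balancer, .157 hosts RabbitMQ.
print(connection_succeeds("10.0.0.20", "10.0.0.136", "10.0.0.157"))
print(connection_succeeds("10.0.0.157", "10.0.0.136", "10.0.0.157"))
```

In this model the worker on the other instance gets a reply that appears to come from the balancer, while the worker on the RabbitMQ host gets a reply from its own address, which matches no connection it opened.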
This appears to be a limitation of Network Load Balancer. Any similarly designed balancer that preserves the client's source address would have the same limitation.
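To check which hosts are affected, a small TCP probe along these lines could be run from each instance. It is only a sketch: `can_connect` is a hypothetical helper name, and the 5-second timeout is an arbitrary choice.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures.
        return False


# Example call, using the load balancer hostname from the error log above:
# can_connect("rabbitmq-abcdefghijklmnop.elb.eu-central-1.amazonaws.com", 5672)
```

If the explanation above holds, the probe should succeed from the instance that does not host RabbitMQ and time out from the one that does.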
See also https://forums.aws.amazon.com/thread.jspa?messageID=805583
Upvotes: 3