Laurent Jalbert Simard

Reputation: 6349

AWS Network Load Balancer doesn't allow traffic to its source instance from its source instance

I have an ECS cluster consisting of 2 instances in different AZs. One of the many services I run is an SMTP relay. I want to put a Network Load Balancer in front of this service so that other applications can easily be configured to use the relay.

Once I got everything in place, I faced the following issue:

If the container runs on instance 'A', only instance 'B' can reach it through the load balancer, and vice versa; a request from the instance that hosts the container times out. So the Network Load Balancer seems to prevent access to a service that lives on the same instance as the client.

Is there something I'm missing here? Is anyone aware of this, and is there a workaround?

Update: When scaling the service to 2 instances it started to work. I now tend to believe it's related to the Availability Zones.

Upvotes: 6

Views: 7526

Answers (4)

Ajith Kumar

Reputation: 1

This works fine for me in the following Network Load Balancer scenario. I have 2 servers in AWS registered behind the NLB. The DNS record is in Cloudflare with the proxy enabled, and the servers use a self-signed certificate. You should also add the ElasticLoadBalancingFullAccess permission for the servers.

Upvotes: 0

Santhosh Kumar A

Reputation: 98

This is due to hairpinning. The connection failure on NLB happens only when the source IP and the target IP are the same.

To solve this problem, disable the Preserve client IP addresses attribute on the target group.

Resolution steps:

  1. Open the Amazon EC2 console.
  2. Go to Target Groups.
  3. Select the name of the target group and open the detail section.
  4. On the Attributes tab, choose Edit.
  5. Under Traffic configuration, disable Preserve client IP addresses.
  6. Save changes.
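
The same change can also be scripted. Below is a minimal sketch in Python, assuming boto3 is installed and AWS credentials are configured. The helper names and the target group ARN are placeholders of mine, not part of the steps above. It relies on the elbv2 `modify_target_group_attributes` call and the `preserve_client_ip.enabled` attribute key.

```python
def preserve_client_ip_attributes(enabled):
    """Build the target group attribute list for client IP preservation."""
    return [
        {
            "Key": "preserve_client_ip.enabled",
            "Value": "true" if enabled else "false",
        }
    ]


def disable_client_ip_preservation(elbv2_client, target_group_arn):
    """Turn off client IP preservation on one target group.

    elbv2_client is expected to be boto3.client("elbv2");
    target_group_arn is a placeholder you must supply yourself.
    """
    return elbv2_client.modify_target_group_attributes(
        TargetGroupArn=target_group_arn,
        Attributes=preserve_client_ip_attributes(False),
    )


# Example usage (requires boto3 and valid credentials):
#   import boto3
#   disable_client_ip_preservation(boto3.client("elbv2"), "<your target group ARN>")
```

With preservation disabled, the target sees the load balancer's private IP as the source, so the source and destination addresses no longer match.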

See the screenshot for reference.

Upvotes: 1

James

Reputation: 553

Solution: If you want to keep the client and the target containers on the same instance and still use an NLB, set the networkMode to "awsvpc" in your task definition and change the target group's target type to "ip" (instead of registering by instance ID).
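
As a rough illustration of those two settings, here is a sketch in Python that only builds the request parameters. The function names, family name, image, memory value, port, and VPC ID are all assumptions for the example; the dictionaries are meant to be passed to boto3's `ecs.register_task_definition` and `elbv2.create_target_group`.

```python
def smtp_relay_task_definition(family, image, port=25):
    """Parameters for ecs.register_task_definition using awsvpc networking."""
    return {
        "family": family,
        "networkMode": "awsvpc",  # each task gets its own ENI and private IP
        "requiresCompatibilities": ["EC2"],
        "containerDefinitions": [
            {
                "name": family,
                "image": image,
                "memory": 128,  # assumed value, size it for your relay
                "portMappings": [{"containerPort": port, "protocol": "tcp"}],
            }
        ],
    }


def ip_target_group(name, vpc_id, port=25):
    """Parameters for elbv2.create_target_group with targets registered by IP."""
    return {
        "Name": name,
        "Protocol": "TCP",
        "Port": port,
        "VpcId": vpc_id,
        "TargetType": "ip",  # register task IPs, not instance IDs
    }


# Example usage (requires boto3 and valid credentials; all values are placeholders):
#   import boto3
#   boto3.client("ecs").register_task_definition(
#       **smtp_relay_task_definition("smtp-relay", "<your image>"))
#   boto3.client("elbv2").create_target_group(
#       **ip_target_group("smtp-relay-tg", "<your VPC ID>"))
```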

Explanation: NLB doesn't support hairpinning of requests. When you register targets by instance ID, the source IP addresses of clients are preserved. When a backend instance connects to the NLB and is itself selected as the target, a loopback is created: the source and destination addresses are the same, which the NLB does not allow, so the connection times out. If an instance is a client of an internal load balancer that it is registered with by instance ID, the connection succeeds only if the request is routed to a different instance.
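
To make the rule concrete, here is a deliberately simplified model in Python. It is not how the NLB is implemented; it only encodes the condition described above. All IP addresses and function names are made up for the illustration.

```python
NLB_NODE_IP = "10.0.1.10"  # hypothetical private IP of the load balancer node


def source_ip_seen_by_target(client_ip, preserve_client_ip):
    """Source address on the packet that reaches the target."""
    return client_ip if preserve_client_ip else NLB_NODE_IP


def connection_succeeds(client_ip, target_ip, preserve_client_ip=True):
    """Simplified rule: the connection fails when source and destination match."""
    return source_ip_seen_by_target(client_ip, preserve_client_ip) != target_ip


INSTANCE_A = "10.0.1.5"   # hosts the target and also acts as a client
INSTANCE_B = "10.0.2.5"   # a different instance
TASK_ENI = "10.0.1.77"    # task IP when using awsvpc with "ip" targets

scenarios = [
    ("A -> NLB -> A, instance target", INSTANCE_A, INSTANCE_A, True),
    ("B -> NLB -> A, instance target", INSTANCE_B, INSTANCE_A, True),
    ("A -> NLB -> A, preservation off", INSTANCE_A, INSTANCE_A, False),
    ("A -> NLB -> task ENI on A", INSTANCE_A, TASK_ENI, True),
]

for label, client, target, preserve in scenarios:
    outcome = "succeeds" if connection_succeeds(client, target, preserve) else "times out"
    print(f"{label}: {outcome}")
```

Only the first scenario fails in this model, which matches the behavior described in the question.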

Some extra info: https://aws.amazon.com/premiumsupport/knowledge-center/target-connection-fails-load-balancer/

Upvotes: 6

Camil Sumodhee

Reputation: 81

I experienced a similar issue.

Here is my setup:

  • A VPC spread over 3 AZs.
  • 3 public subnets (one in each AZ)
  • 1 instance in a public subnet in AZ-a
  • 3 private subnets (one in each AZ)
  • 1 NLB spread over the 3 private subnets.
  • A cluster of ECS instances. 1 instance in each private subnet. (instance-a in AZ-a, instance-b in AZ-b, instance-c in AZ-c)
  • A service running on each instance; in total, 3 healthy services spread over the 3 private subnets and registered with the NLB.
  • A route 53 Alias record to map "myservice.example.com" to the NLB DNS name.

Below the tests executed:

Query initiated from an instance in the private subnet

Test1: From instance-a (in AZ-a), query "myservice.example.com".

Result1: The query hits the NLB on one of its private IPs. If the IP is in the same subnet as instance-a, the query times out. If the IP is in a different subnet, the query succeeds.

Test2: Same as Test1 but query from instance-b (in AZ-b).

Result2: The query hits the NLB on one of its private IPs. If the IP is in the same subnet as instance-b, the query times out. If the IP is in a different subnet, the query succeeds.

Similar result with a query initiated from instance-c.

Query initiated from an instance in a public subnet AZ-a

Test3: From the instance in public subnet in AZ-a, query "myservice.example.com".

Result3: The query hits the NLB on one of its private IPs. The query always succeeds, regardless of which private IP was hit.

Query initiated from an extra instance (instance-a2) in private subnet AZ-a

Test4: I launched an additional instance (instance-a2) in the private subnet in AZ-a. Then, from instance-a2, I queried "myservice.example.com". IMPORTANT: This instance does not run any service and therefore can never be selected by the NLB as a target.

Result4: The query succeeds all the time! Even when hitting a target that is in the private subnet A (same subnet as instance-a2).

Conclusions:

  • With Test1 and Test2, I could reproduce the same issue as Laurent Jalbert Simard when querying from an instance that was hosting the target service.
  • As per Test3, the issue does not seem to be caused by requests coming from the same AZ as the target service.
  • With Test4, it appears that the issue cannot be reproduced if the query comes from an instance other than the one hosting the target service, even if both are in the same subnet.

Therefore, my conclusion so far is that the NLB times out if the source IP of the request and the IP of the target selected by the NLB are the same.
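
For anyone who wants to repeat these tests, here is a small sketch in Python that resolves the load balancer's DNS name and tries a TCP connection to each of its IPs separately, so you can see which ones time out from a given instance. The function names, the port, and the timeout are assumptions; "myservice.example.com" is the record from my setup above.

```python
import socket


def resolve_ipv4(host, port):
    """Return the sorted, de-duplicated IPv4 addresses behind a DNS name."""
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({sockaddr[0] for _, _, _, _, sockaddr in infos})


def probe(ip, port, timeout=5.0):
    """Try one TCP connection; return "ok", "timeout", or "error"."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return "ok"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"


def probe_all(host, port, timeout=5.0):
    """Probe every IP behind the DNS name and map each IP to its outcome."""
    return {ip: probe(ip, port, timeout) for ip in resolve_ipv4(host, port)}


# Example usage, run from each instance in turn (port 25 is an assumption):
#   for ip, outcome in probe_all("myservice.example.com", 25).items():
#       print(ip, outcome)
```

Running it from an instance that hosts a target and from one that does not should make the difference between Test1 and Test4 visible per load balancer IP.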

I couldn't find this limitation documented in the AWS NLB docs, and so far nothing comes up in a Google search. Has anybody out there reached the same conclusion?

Upvotes: 8
