Skip

Reputation: 6531

Internal NLB won't route to instance X when curl to the NLB DNS from same instance X

Problem:
When I am on the instance 10.141.80.140 and curl the DNS name of the NLB, I get no response.
I expect the NLB to route the request to 10.141.80.140, but it doesn't happen.
The NLB DNS name only fails to route when I am on 10.141.80.140; routing works from other instances in the same subnet.

Details:

Question:
Is there something that prevents the NLB from handling a request from an instance when that request would be routed back to the same instance, which is in the NLB listener's target group?


Upvotes: 6

Views: 1840

Answers (2)

Kishor Unnikrishnan

Reputation: 2589

Adding to the above answer: yes, the problem occurs if you send a request from the same instance that is registered as a backend for the NLB. In short,

Network Load Balancers preserve the source IP, so both the source and destination of the arriving packet are the private IP address of the target. Then, the host operating system sees the packet as not valid so it doesn't send response traffic, and the connection fails.

More details here:

  1. https://repost.aws/knowledge-center/target-connection-fails-load-balancer
  2. https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-target-groups.html#client-ip-preservation

If an instance must send requests to a load balancer that it's registered with, do one of the following:

  1. Disable client IP preservation.[2]

  2. Ensure that containers that must communicate are on different container instances.
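
As a rough sketch of the first option: client IP preservation is a target group attribute (`preserve_client_ip.enabled`) that can be changed through the ELBv2 API. The helper below assumes a boto3 `elbv2` client is passed in; the function name and the ARN are placeholders for illustration, not something from the linked articles.

```python
def disable_client_ip_preservation(elbv2_client, target_group_arn):
    """Turn off client IP preservation on one NLB target group.

    elbv2_client is assumed to be a boto3 "elbv2" client, or any object
    exposing modify_target_group_attributes with the same keyword arguments.
    """
    return elbv2_client.modify_target_group_attributes(
        TargetGroupArn=target_group_arn,
        Attributes=[
            # Attribute values are passed as strings, not booleans.
            {"Key": "preserve_client_ip.enabled", "Value": "false"},
        ],
    )
```

With boto3 installed and credentials configured, this would be called as `disable_client_ip_preservation(boto3.client("elbv2"), "<your target group ARN>")`. Note that the targets then see the NLB's private IP as the source, so any logic that depends on the real client IP needs another mechanism.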

Upvotes: 2

Skip

Reputation: 6531

This is a well-known behavior that I am glad to explain. Network Load Balancer introduced the source address preservation feature: the original IP addresses and source ports of incoming connections remain unmodified. When the target answers a request, the VPC internals capture the packet and forward it to the NLB, which forwards it to its destination.

This behavior has a side effect: when the OS kernel detects that the destination address of an egress packet is one of its local addresses, it forwards the packet directly to the application.

For example, given the following components:

  • We have an internal NLB and a backend instance. Both are deployed in the subnet 10.0.0.0/24.
  • The NLB has the IP 10.0.0.10 and a listener on port 80 that forwards the request to the port 8080.
  • The backend instance has the address 10.0.0.55 and has a web server listening on port 8080. It has a security group that allows all the incoming local traffic.

  • If the instance tries to establish a connection with the NLB, the flow of the communication would be the following:

    • The instance wants to telnet to the NLB: it requests a TCP connection to the NLB DNS name on port 80.
      • As it is an outgoing communication, it starts from an ephemeral port; the instance sends a SYN packet (1):
        • Source: 10.0.0.55:40000
        • Destination: 10.0.0.10:80
      • The NLB receives the packet and forwards it to the backend instance (10.0.0.55:8080).
      • Due to the address preservation feature, the backend instance receives a SYN packet with the following information:
        • Source: 10.0.0.55:40000
        • Destination: 10.0.0.55:8080
      • The operating system routes the packet internally (as its destination is the machine itself), and this is where the issue happens:
        • The initiating socket is expecting the SYN_ACK from 10.0.0.10:80 (the NLB).
        • However, it receives the SYN_ACK from 10.0.0.55:40000 (the instance itself).
        • The OS will send several TCP retransmissions until the connection times out.
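
The address rewriting in the flow above can be sketched as a toy model. This is only an illustration of the addresses involved, not of how the NLB is implemented; the `Packet` type and both function names are made up for the example, and the case without preservation simply reuses the client's source port for brevity.

```python
from typing import NamedTuple, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


class Packet(NamedTuple):
    src: Endpoint
    dst: Endpoint


def nlb_forward(packet: Packet, target: Endpoint, preserve_client_ip: bool) -> Packet:
    """Model the NLB rewriting a packet on its way to the target.

    The destination always becomes the target. The source is kept when
    client IP preservation is on; otherwise it becomes the NLB's address.
    """
    src = packet.src if preserve_client_ip else (packet.dst[0], packet.src[1])
    return Packet(src=src, dst=target)


def source_is_destination_host(packet: Packet) -> bool:
    """True when the packet claims to come from the host it is delivered to."""
    return packet.src[0] == packet.dst[0]


NLB = ("10.0.0.10", 80)
INSTANCE_IP = "10.0.0.55"
TARGET = (INSTANCE_IP, 8080)

# SYN sent by the instance to the NLB listener.
syn = Packet(src=(INSTANCE_IP, 40000), dst=NLB)

preserved = nlb_forward(syn, TARGET, preserve_client_ip=True)
rewritten = nlb_forward(syn, TARGET, preserve_client_ip=False)

print(preserved, source_is_destination_host(preserved))
print(rewritten, source_is_destination_host(rewritten))
```

With preservation on, the packet arriving at the instance carries the instance's own IP as both source and destination, which is the situation the connection cannot recover from; with the NLB address as source, the reply has a distinct peer to go back to.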

This will not happen with a public NLB, as the instance needs NAT in the VPC to use its public IP address when sending the request to the NLB, so the kernel will not forward the packet internally.

Finally, a possible workaround is registering the backends by their IP address instead of their instance ID. With this method, the traffic forwarded by the NLB carries the NLB's internal IP as the source IP, which effectively disables the "source address preservation" feature. Unfortunately, if you launch instances with an Auto Scaling group, it can only register the launched instances by their ID. For ECS tasks, configuring the network mode as "awsvpc" forces the NLB to register each target by its IP.
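
A minimal sketch of registering a backend by IP, assuming a boto3 `elbv2` client and a target group that was created with target type `ip` (an instance-type target group will not accept IP addresses). The function name is hypothetical, and the address and port are taken from the example above.

```python
def register_ip_target(elbv2_client, target_group_arn, ip_address, port):
    """Register one backend by private IP instead of by instance ID.

    elbv2_client is assumed to be a boto3 "elbv2" client, or any object
    exposing register_targets with the same keyword arguments.
    """
    return elbv2_client.register_targets(
        TargetGroupArn=target_group_arn,
        # For an IP-type target group, "Id" holds the IP address.
        Targets=[{"Id": ip_address, "Port": port}],
    )
```

For the example instance this would be `register_ip_target(boto3.client("elbv2"), "<your target group ARN>", "10.0.0.55", 8080)`.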

Upvotes: 9
