Michal Grzelak

Reputation: 321

Nginx internal dns resolve issue

I have an nginx container in AWS that acts as a reverse proxy for my website, e.g. https://example.com. My backend services automatically register in a local DNS zone, aws.local (this is done by AWS ECS Auto-Discovery). The problem is that nginx only resolves a name to an IP at startup, so when a service container is restarted and gets a new IP, nginx still tries the old IP and I get a "502 Bad Gateway" error.

Here is the configuration I am running:

worker_processes 1;
events { worker_connections 1024; }
http {
    sendfile on;
    include    /etc/nginx/mime.types;
    log_format  graylog2_json  '{ "timestamp": "$time_iso8601", '
                       '"remote_addr": "$remote_addr", '
                       '"body_bytes_sent": $body_bytes_sent, '
                       '"request_time": $request_time, '
                       '"response_status": $status, '
                       '"request": "$request", '
                       '"request_method": "$request_method", '
                       '"host": "$host",'
                       '"upstream_cache_status": "$upstream_cache_status",'
                       '"upstream_addr": "$upstream_addr",'
                       '"http_x_forwarded_for": "$http_x_forwarded_for",'
                       '"http_referrer": "$http_referer", '
                       '"http_user_agent": "$http_user_agent" }';


    upstream service1 {
        server service1.aws.local:8070;
    }

    upstream service2 {
        server service2.aws.local:8080;
    }

    resolver 10.0.0.2 valid=10s;

    server {
        listen 443 http2 ssl;
        server_name example.com;
        location /main {
            proxy_pass         http://service1;
        }

        location /auth {
            proxy_pass         http://service2;
        }
    }
}

I found advice to change the nginx config to resolve names per request, but then my browser tries to open "service2.aws.local:8070" and fails, since that is an AWS-local DNS name. I should see "https://example.com/auth" in my browser.

server {

        set $main service1.aws.local:2000;
        set $auth service2.aws.local:8070;

        location /main {
            proxy_http_version 1.1;
            proxy_pass http://$main;
        }
        location /auth {
            proxy_http_version 1.1;
            proxy_pass http://$auth;
        }
}

Can you help me fix it? Thanks!

Upvotes: 2

Views: 8657

Answers (3)

darw

Reputation: 1057

TL;DR

resolver 169.254.169.253;
set $upstream "service1.aws.local";
proxy_pass http://$upstream:8070;

Just like with ECS, I experienced the same issue when using Docker Compose.

According to six8's comment on GitHub:

nginx only resolves hostnames on startup. You can use variables with proxy_pass to get it to use the resolver for runtime lookups.

See:

https://forum.nginx.org/read.php?2,215830,215832#msg-215832

https://www.ruby-forum.com/topic/4407628

It's quite annoying.

One of the links above provides an example:

resolver 127.0.0.1;
set $backend "foo.example.com";
proxy_pass http://$backend;

The resolver part is necessary, and we can't refer to the defined upstream blocks here.
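If you would rather keep the upstream blocks, newer nginx builds can re-resolve upstream servers themselves. A minimal sketch, assuming open-source nginx 1.27.3 or later (or NGINX Plus), where the server directive accepts the resolve parameter. The zone name and size are placeholders I picked, and the resolver line is copied from the question:

```nginx
http {
    # "resolve" needs a resolver; address and valid= are taken from the question.
    resolver 10.0.0.2 valid=10s;

    upstream service1 {
        # "resolve" requires the server group to live in shared memory.
        # Zone name and size are illustrative assumptions.
        zone service1 64k;

        # Re-resolve this name at runtime instead of only at startup.
        server service1.aws.local:8070 resolve;
    }
}
```

On older builds the resolve parameter is not available, so the variable approach is the fallback.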

According to Ivan Frolov's answer on StackExchange, the resolver's address should be set to 169.254.169.253, the link-local address of the Amazon-provided DNS server.
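Applied to the config from the question, this could look roughly like the sketch below. It is not a tested drop-in: the names and ports are copied from the question's first config, and the Host header and proxy_redirect lines rest on my assumption that the backend builds its redirects from the internal host name, which would explain the browser being sent to service2.aws.local.

```nginx
server {
    # listen, server_name and TLS settings stay as in your existing config.

    # Amazon-provided DNS; valid=10s overrides the record TTL in nginx's cache.
    resolver 169.254.169.253 valid=10s;

    # A variable in proxy_pass makes nginx resolve the name at request time.
    set $main service1.aws.local:8070;
    set $auth service2.aws.local:8080;

    location /main {
        proxy_http_version 1.1;
        # Assumption: pass the public host so the backend does not
        # generate links or redirects with the internal name.
        proxy_set_header Host $host;
        proxy_pass http://$main;
    }

    location /auth {
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        # "proxy_redirect default" is not permitted when proxy_pass uses
        # variables, so rewrite internal Location headers explicitly.
        proxy_redirect http://service2.aws.local:8080/ /;
        proxy_pass http://$auth;
    }
}
```

Because proxy_pass has no URI part here, the original request URI (for example /auth/login) is passed to the backend unchanged.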

Upvotes: 4

GNOKOHEAT

Reputation: 963

I found a solution to this issue. Nginx "proxy_pass" can't use "/etc/hosts" information when it resolves names at runtime through the "resolver" directive.

I suggest using HAProxy as the reverse proxy in ECS. I tried an nginx reverse proxy but failed, and succeeded with HAProxy. Its configuration is simpler than nginx's.

First, use the Docker "links" option and set "environment variables" (e.g. LINK_APP, LINK_PORT).

Second, fill these "environment variables" into haproxy.cfg.

Also, I recommend using "dynamic port mapping" with an ALB; it makes the setup more flexible.

taskdef.json:

{
    "executionRoleArn": "arn:aws:iam::<AWS_ACCOUNT_ID>:role/<APP_NAME>_ecsTaskExecutionRole",
    "containerDefinitions": [
      {
        "name": "<APP_NAME>-rp",
        "image": "gnokoheat/ecs-reverse-proxy:latest",
        "essential": true,
        "memoryReservation": <MEMORY_RESV>,
        "portMappings": [
          {
            "hostPort": 0,
            "containerPort": 80,
            "protocol": "tcp"
          }
        ],
        "links": [
          "<APP_NAME>"
        ],
        "environment": [
          {
            "name": "LINK_PORT",
            "value": "<SERVICE_PORT>"
          },
          {
            "name": "LINK_APP",
            "value": "<APP_NAME>"
          }
        ]
      },
      {
        "name": "<APP_NAME>",
        "image": "<IMAGE_NAME>",
        "essential": true,
        "memoryReservation": <MEMORY_RESV>,
        "portMappings": [
          {
            "protocol": "tcp",
            "containerPort": <SERVICE_PORT>
          }
        ],
        "environment": [
          {
            "name": "PORT",
            "value": "<SERVICE_PORT>"
          },
          {
            "name": "APP_NAME",
            "value": "<APP_NAME>"
          }
        ]
      }
    ],
    "requiresCompatibilities": [
      "EC2"
    ],
    "networkMode": "bridge",
    "family": "<APP_NAME>"
}

haproxy.cfg:

# haproxy.cfg

global
    daemon
    pidfile /var/run/haproxy.pid

defaults
    log global
    mode http
    retries 3
    timeout connect 5000
    timeout client 50000
    timeout server 50000

frontend http
    bind *:80

    http-request set-header X-Forwarded-Host %[req.hdr(Host)]

    compression algo gzip
    compression type text/css text/javascript text/plain application/json application/xml

    default_backend app

backend app
    server static "${LINK_APP}":"${LINK_PORT}"

Dockerfile (haproxy):

FROM haproxy:1.7
USER root
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg

See:

GitHub: https://github.com/gnokoheat/ecs-reverse-proxy

Docker image: gnokoheat/ecs-reverse-proxy:latest

Upvotes: 0

yandy

Reputation: 16

What is the TTL for your Cloud Map Service Discovery records? If you run nslookup from the NGINX container (assuming EC2 mode, so that you can exec into the container), does it return the new record? Without more information it's hard to say, but I'd venture that this is a TTL issue and not an NGINX/Service Discovery problem.

Lower the TTL to 1 second and see if that works.
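On the nginx side, the resolver cache can also be set independently of the record TTL. By default nginx caches an answer for the TTL of the response, and the valid parameter overrides that. A minimal fragment, assuming the resolver address from the question:

```nginx
# Cache DNS answers for 10 seconds, overriding the record TTL.
# This only affects names nginx resolves at runtime (for example a
# variable in proxy_pass); a plain "server" line in an upstream block
# is still resolved once at startup.
resolver 10.0.0.2 valid=10s;
```

So lowering the TTL alone would not help if the name is only resolved when nginx starts.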

AWS Cloud Map API Reference: DnsRecord

Upvotes: 0
