Konrad Malik

NGINX on Docker Swarm to serve multiple applications on the same port

I know that similar questions have been asked, but none of the topics, articles and blogs I found helped me resolve my issue. Let me be very straightforward and specific here:

1. What I have:

Docker Swarm cluster (1 local node), NGINX as a reverse proxy, and for the sake of this example: apache, spark, rstudio and jupyter notebook containers.

2. What I want:

I want to set up NGINX so that I expose only one port to the host (80, NGINX) and serve these 4 applications through NGINX over that same port, but under different paths. On my local dev environment I want apache to be accessible at "127.0.0.1/apache", rstudio at "127.0.0.1/rstudio", the Spark UI at "127.0.0.1/spark" and jupyter at "127.0.0.1/jupyter". All these applications use different ports internally, which is not a problem (apache - 80, spark - 8080, rstudio - 8787, jupyter - 8888); I want them to share the same port externally, on the host.

3. What I don't have:

I don't have and won't have a domain name. My stack should work when all I have is a public IP to the server or servers that I own. No domain name. I saw multiple examples of how to do what I want using hostnames, and I don't want that. I want to access my stack only by IP and path, for example 123.123.123.123/jupyter.

4. What I came up with:

And now to my actual problem: I have a partially working solution. Concretely, apache and rstudio work fine; jupyter and spark do not. By that I mean jupyter's redirects are causing problems: when I go to 127.0.0.1/jupyter I am redirected to the login page, but instead of then being redirected to 127.0.0.1/jupyter/tree, I am sent to 127.0.0.1/tree, which of course does not exist. The Spark UI won't render properly, because all its css and js files live under 127.0.0.1/spark/some.css, but the UI tries to fetch them from 127.0.0.1/some.css, and it is basically the same story with all the other dashboards.
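One quick way to see the asset problem, assuming the stack below is running locally (the exact output will vary):

$ curl -s http://127.0.0.1/spark/ | grep -oE '(href|src)="[^"]*"'

The references come back root-relative (/some.css) instead of prefixed with /spark/, which is exactly why the browser fetches them from the wrong place.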

In my actual stack I have more services, like hue, kafdrop etc., and none of them work. The only things that do work are apache, tomcat and rstudio. I'm surprised that rstudio works without problems: authentication, logging in and out etc. are all completely fine. I honestly have no idea why it works when everything else fails.

I tried to do the same with Traefik, with the same outcome. With Traefik I could not even get rstudio working; all dashboards suffered the same problems: static content not loading properly, or, for dashboards with a login page, bad redirects.

5. Questions:

So my questions are: how should NGINX (or any other reverse proxy) be configured so that applications like jupyter and spark work correctly under a sub-path, and why do their redirects and static assets break while apache and rstudio work fine?

My minimal working example is below. First initialize the swarm and create the network:

docker swarm init


docker network create -d overlay --attachable bigdata-net

docker-compose.yml

version: '3'

services:
    nginx:
        image: nginx:alpine
        volumes:
            - ./nginx.conf:/etc/nginx/nginx.conf:ro
        ports:
            - 80:80
            - 443:443
        networks:
            - bigdata-net
        deploy:
            mode: replicated
            replicas: 1
            restart_policy:
                condition: any

    apache:
        image: httpd:alpine
        networks:
            - bigdata-net
        deploy:
            mode: replicated
            replicas: 1
            restart_policy:
                condition: any

    rstudio:
        image: rocker/rstudio:3.5.2
        networks:
            - bigdata-net
        environment:
            - PASSWORD=admin
        deploy:
            mode: replicated
            replicas: 1
            restart_policy:
                condition: any

    jupyter:
        image: jupyter/all-spark-notebook:latest
        networks:
            - bigdata-net
        deploy:
            mode: replicated
            replicas: 1
            restart_policy:
                condition: any

    spark:
        image: bde2020/spark-master:2.2.1-hadoop2.7
        networks:
            - bigdata-net
        deploy:
            mode: replicated
            replicas: 1
            restart_policy:
                condition: on-failure

networks:
    bigdata-net:
        external: true

nginx.conf

worker_processes auto;

events {
    worker_connections 1024; 
}

http {

    log_format compression '$remote_addr - $remote_user [$time_local] '
        '"$request" $status $upstream_addr '
        '"$http_referer" "$http_user_agent" "$gzip_ratio"';

    server {
        listen 80;
        listen [::]:80;
        access_log /var/log/nginx/access.log compression;

        ######### APACHE
        location = /apache { # without this, only the URL with a trailing slash would work
            return 301 /apache/;
        }

        location /apache/ {
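            # the variable + resolver force per-request DNS lookups through
            # Docker's embedded DNS; the rewrite strips the /apache prefix and
            # proxy_redirect maps upstream Location headers back under /apache/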
            set $upstream_endpoint apache:80;
            resolver 127.0.0.11 valid=5s;
            rewrite ^/apache(/.*) $1 break;
            proxy_pass $scheme://$upstream_endpoint;
            proxy_redirect $scheme://$upstream_endpoint/ $scheme://$host/apache/;
        }

        ######### RSTUDIO
        location = /rstudio { # without this, only the URL with a trailing slash would work
            return 301 /rstudio/;
        }

        location /rstudio/ {
            set $upstream_endpoint rstudio:8787;
            resolver 127.0.0.11 valid=5s;
            rewrite ^/rstudio(/.*) $1 break;
            proxy_pass $scheme://$upstream_endpoint;
            proxy_redirect $scheme://$upstream_endpoint/ $scheme://$host/rstudio/;
        }

        ######### JUPYTER
        location = /jupyter { # without this, only the URL with a trailing slash would work
            return 301 /jupyter/;
        }

        location /jupyter/ {
            set $upstream_endpoint jupyter:8888;
            resolver 127.0.0.11 valid=5s;
            rewrite ^/jupyter(/.*) $1 break;
            proxy_pass $scheme://$upstream_endpoint;
            proxy_redirect $scheme://$upstream_endpoint/ $scheme://$host/jupyter/;
        }

        ######### SPARK
        location = /spark { # without this, only the URL with a trailing slash would work
            return 301 /spark/;
        }

        location /spark/ {
            set $upstream_endpoint spark:8080;
            resolver 127.0.0.11 valid=5s;
            rewrite ^/spark(/.*) $1 break;
            proxy_pass $scheme://$upstream_endpoint;
            proxy_redirect $scheme://$upstream_endpoint/ $scheme://$host/spark/;
        }
    }
}
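With both files in place, everything is deployed as a single swarm stack; a minimal sketch, assuming a stack name of bigdata (the name is arbitrary):

docker stack deploy -c docker-compose.yml bigdata

After that, docker service ls should show one replica of each service, with nginx as the only service publishing ports to the host.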

Also, these are the materials based on which I created and modified this config:

https://medium.com/@joatmon08/using-containers-to-learn-nginx-reverse-proxy-6be8ac75a757
https://community.rstudio.com/t/running-shinyapps-from-rstudio-server-behind-nginx-proxy/17706/4

I hope someone can help me; I have trouble sleeping since I cannot resolve this issue ;)


Answers (1)

Dmitrii

I can't help with Jupyter and Spark specifically, but I hope this answer will help you anyway.

If you plan to put something behind a reverse proxy, you should first verify that it is able to work behind a reverse proxy at all. As you mentioned, instead of being redirected to 127.0.0.1/jupyter/tree you end up at 127.0.0.1/tree. That happens because for Jupyter the root is /, not /jupyter, so you need to find the option in its config that changes this. As an example, here is how Grafana does it:

# The full public facing url you use in browser, used for redirects and emails
# If you use reverse proxy and sub path specify full url (with sub path)
root_url = https://example.com/grafana
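For the stack in the question, the analogous settings would be Jupyter's base_url and the Spark UI's reverse-proxy options; a minimal sketch, assuming the classic Jupyter notebook server and Spark 2.x (the values below are guesses for this particular setup, not something I have tested):

# jupyter_notebook_config.py - tell Jupyter it is served under /jupyter
c.NotebookApp.base_url = '/jupyter'

# spark-defaults.conf - tell the Spark UI it sits behind a reverse proxy
spark.ui.reverseProxy    true
spark.ui.reverseProxyUrl http://127.0.0.1/spark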

The NGINX config can also be simplified; take a look at this example:

nginx config

# /etc/nginx/conf.d/default.conf

server {
    listen 8080 default_server;

    location / {
        proxy_pass     http://echo:8080/;

        proxy_set_header X-Real-IP           $remote_addr;
        proxy_set_header X-Forwarded-Host    $host;
        proxy_set_header X-Forwarded-Port    $server_port;
        proxy_set_header X-Forwarded-Proto   $scheme;
        proxy_set_header X-Forwarded-Request $request;
        proxy_set_header X-Forwarded-Agent   $http_user_agent;
    }

    location ~ /echo([0-9]+)/ {
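        # drop the /echoN prefix so the upstream sees only the rest of the path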
        rewrite ^/echo([0-9]+)(.*)$ $2 break;
        proxy_pass     http://echo:8080;

        proxy_set_header X-Real-IP           $remote_addr;
        proxy_set_header X-Forwarded-Host    $host;
        proxy_set_header X-Forwarded-Port    $server_port;
        proxy_set_header X-Forwarded-Proto   $scheme;
        proxy_set_header X-Forwarded-Request $request;
        proxy_set_header X-Forwarded-Agent   $http_user_agent;
    }
}

docker-compose

version: "3.2"

services:
    nginx:
        image: nginx:alpine
        ports:
            - '8080:8080'
        volumes:
            - ./default.conf:/etc/nginx/conf.d/default.conf

    echo:
        image: caa06d9c/echo

test

$ curl -L localhost:8080/echo1/

{
    "method": "GET",
    "path": "/",
    "ip": "172.31.0.1",
    "headers": {
        "X-Forwarded-Host": "localhost",
        "X-Forwarded-Port": "8080",
        "X-Forwarded-Proto": "http",
        "X-Forwarded-Agent": "curl/7.54.0",
        "X-Forwarded-Request": "GET /echo1/ HTTP/1.1"
    }
}

remarks

Headers like

proxy_set_header  Host              $http_host;
proxy_set_header  X-Real-IP         $remote_addr;
proxy_set_header  X-Forwarded-For   $proxy_add_x_forwarded_for;
proxy_set_header  X-Forwarded-Proto $scheme;

should be put into a location only if the software behind it requires them, and the header names, like X-Real-IP, can differ between applications; you need to verify them against each application's requirements.

You don't need

rewrite ^/rstudio(/.*) $1 break;

because nginx follows the correct rules automatically when the proxy_pass URL ends with a path; you only need a rewrite for URIs like /path when you want to cut the prefix off, so that the upstream sees / (or something else). You also don't need

resolver 127.0.0.11 valid=5s;

because you use localhost, and you don't need

set $upstream_endpoint jupyter:8888;

because proxy_pass can take the address directly, nor

proxy_redirect $scheme://$upstream_endpoint/ $scheme://$host/jupyter/;

because proxy_pass already takes care of it (the default proxy_redirect is derived from the proxy_pass URL).
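Putting the remarks together, each location block from the question could shrink to something like this (a sketch, assuming the application behind it has been told about its sub-path as described above):

location /rstudio/ {
    # the trailing slash makes nginx replace the matched /rstudio/ prefix
    # with / before the request is passed upstream
    proxy_pass http://rstudio:8787/;
}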

Everything else looks good.
