Reputation: 76
I know that similar questions have been asked, but none of the topics, articles and blogs that I found allowed me to resolve my issue. Let me be very straightforward and specific here:
1. What I have:
Docker Swarm cluster (1 local node), NGINX as a reverse proxy, and for the sake of this example: apache, spark, rstudio and jupyter notebook containers.
2. What I want:
I want to set up NGINX so that I can expose only one port to the host (80 - NGINX) and serve these 4 applications through NGINX over the same port (80) but under different paths. On my local dev environment I want apache to be accessible under "127.0.0.1/apache", rstudio under "127.0.0.1/rstudio", spark UI under "127.0.0.1/spark" and jupyter under "127.0.0.1/jupyter". All these applications use different ports internally, which is not a problem (apache - 80, spark - 8080, rstudio - 8787, jupyter - 8888). I want them to use the same port externally, on the host.
3. What I don't have:
I don't have and won't have a domain name. My stack should work when all I have is a public IP to the server or to multiple servers that I own. No domain name. I saw multiple examples of how to do what I want using hostnames, but I don't want that. I want to access my stack only by IP and path, for example 123.123.123.123/jupyter.
4. What I came up with:
And now to my actual problem: I have a partially working solution. Concretely, apache and rstudio work fine, jupyter and spark do not. For jupyter, the redirects cause problems: when I go to 127.0.0.1/jupyter I am redirected to the login page, but instead of redirecting to 127.0.0.1/jupyter/tree, it redirects me to 127.0.0.1/tree, which of course does not exist. Spark UI won't render properly, because all css and js files are under 127.0.0.1/spark/some.css, but Spark UI tries to get them from 127.0.0.1/some.css, and it is basically the same story with all other dashboards.
In my actual stack I have more services like hue, kafdrop etc. and none of them work. Actually the only things that work are apache, tomcat and rstudio. I'm surprised that rstudio works without problems, including authentication, logging in and out etc. It is completely ok. I actually have no idea why it works when everything else fails.
I tried to do the same with Traefik - same outcome. With traefik I could not even set up rstudio, all dashboards suffered the same problem - not properly loading static content, or dashboards with login page - bad redirects.
5. Questions:
So my questions are:
My minimal working example is below: First initialize swarm and create network:
docker swarm init
docker network create -d overlay --attachable bigdata-net
docker-compose.yml
version: '3'
services:
  nginx:
    image: nginx:alpine
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    ports:
      - 80:80
      - 443:443
    networks:
      - bigdata-net
    deploy:
      mode: replicated
      replicas: 1
      restart_policy:
        condition: any
  apache:
    image: httpd:alpine
    networks:
      - bigdata-net
    deploy:
      mode: replicated
      replicas: 1
      restart_policy:
        condition: any
  rstudio:
    image: rocker/rstudio:3.5.2
    networks:
      - bigdata-net
    environment:
      - PASSWORD=admin
    deploy:
      mode: replicated
      replicas: 1
      restart_policy:
        condition: any
  jupyter:
    image: jupyter/all-spark-notebook:latest
    networks:
      - bigdata-net
    deploy:
      mode: replicated
      replicas: 1
      restart_policy:
        condition: any
  spark:
    image: bde2020/spark-master:2.2.1-hadoop2.7
    networks:
      - bigdata-net
    deploy:
      mode: replicated
      replicas: 1
      restart_policy:
        condition: on-failure
nginx.conf
worker_processes auto;

events {
    worker_connections 1024;
}

http {
    log_format compression '$remote_addr - $remote_user [$time_local] '
                           '"$request" $status $upstream_addr '
                           '"$http_referer" "$http_user_agent" "$gzip_ratio"';

    server {
        listen 80;
        listen [::]:80;
        access_log /var/log/nginx/access.log compression;

        ######### APACHE
        location = /apache { # without this only URL with trailing slash would work
            return 301 /apache/;
        }
        location /apache/ {
            set $upstream_endpoint apache:80;
            resolver 127.0.0.11 valid=5s;
            rewrite ^/apache(/.*) $1 break;
            proxy_pass $scheme://$upstream_endpoint;
            proxy_redirect $scheme://$upstream_endpoint/ $scheme://$host/apache/;
        }

        ######### RSTUDIO
        location = /rstudio { # without this only URL with trailing slash would work
            return 301 /rstudio/;
        }
        location /rstudio/ {
            set $upstream_endpoint rstudio:8787;
            resolver 127.0.0.11 valid=5s;
            rewrite ^/rstudio(/.*) $1 break;
            proxy_pass $scheme://$upstream_endpoint;
            proxy_redirect $scheme://$upstream_endpoint/ $scheme://$host/rstudio/;
        }

        ######### JUPYTER
        location = /jupyter { # without this only URL with trailing slash would work
            return 301 /jupyter/;
        }
        location /jupyter/ {
            set $upstream_endpoint jupyter:8888;
            resolver 127.0.0.11 valid=5s;
            rewrite ^/jupyter(/.*) $1 break;
            proxy_pass $scheme://$upstream_endpoint;
            proxy_redirect $scheme://$upstream_endpoint/ $scheme://$host/jupyter/;
        }

        ######### SPARK
        location = /spark { # without this only URL with trailing slash would work
            return 301 /spark/;
        }
        location /spark/ {
            set $upstream_endpoint spark:8080;
            resolver 127.0.0.11 valid=5s;
            rewrite ^/spark(/.*) $1 break;
            proxy_pass $scheme://$upstream_endpoint;
            proxy_redirect $scheme://$upstream_endpoint/ $scheme://$host/spark/;
        }
    }
}
Also, materials based on which I created and modified this config: https://medium.com/@joatmon08/using-containers-to-learn-nginx-reverse-proxy-6be8ac75a757 https://community.rstudio.com/t/running-shinyapps-from-rstudio-server-behind-nginx-proxy/17706/4
I hope someone can help me, I have trouble sleeping since I cannot resolve this issue ;)
Upvotes: 0
Views: 2554
Reputation: 917
I can't help with Jupyter and Spark but hope that this answer will help you.
If you plan to put something behind a reverse proxy, you should verify that it can work behind one. As you mentioned, "instead of redirecting to 127.0.0.1/jupyter/tree, it redirects me to 127.0.0.1/tree". That happens because for Jupyter the root is /, not /jupyter, so you need to find out how to change that in its config. As an example, this is the corresponding setting for Grafana:
# The full public facing url you use in browser, used for redirects and emails
# If you use reverse proxy and sub path specify full url (with sub path)
root_url = https://example.com/grafana
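For Jupyter, a rough sketch of the same idea could look like the location below. It assumes that Jupyter itself has been started with its base URL set to /jupyter (in the classic notebook server this is the NotebookApp.base_url option; the exact option name depends on the Jupyter version, so verify it), and it reuses the jupyter:8888 service name from the question. Because the application then expects the /jupyter prefix on every request, the path is passed through unchanged instead of being stripped. Treat it as an untested fragment under those assumptions.

```nginx
location /jupyter/ {
    # No rewrite and no URI part on proxy_pass: the request path,
    # including the /jupyter prefix, is forwarded as-is.
    # Assumption: Jupyter was started with base URL /jupyter.
    proxy_pass http://jupyter:8888;
    proxy_set_header Host $http_host;

    # Allow WebSocket upgrades, which the notebook uses to talk to kernels.
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}
```

With the application generating its own links and redirects under /jupyter, no proxy_redirect rule should be required for this location.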
The NGINX config can be simplified; look at this example:
nginx config
# /etc/nginx/conf.d/default.conf
server {
    listen 8080 default_server;

    location / {
        proxy_pass http://echo:8080/;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Host $host;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Request $request;
        proxy_set_header X-Forwarded-Agent $http_user_agent;
    }

    location ~ /echo([0-9]+)/ {
        rewrite ^/echo([0-9]+)(.*)$ $2 break;
        proxy_pass http://echo:8080;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Host $host;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Request $request;
        proxy_set_header X-Forwarded-Agent $http_user_agent;
    }
}
docker-compose
version: "3.2"
services:
  nginx:
    image: nginx:alpine
    ports:
      - '8080:8080'
    volumes:
      - ./default.conf:/etc/nginx/conf.d/default.conf
  echo:
    image: caa06d9c/echo
test
$ curl -L localhost:8080/echo1/
{
  "method": "GET",
  "path": "/",
  "ip": "172.31.0.1",
  "headers": {
    "X-Forwarded-Host": "localhost",
    "X-Forwarded-Port": "8080",
    "X-Forwarded-Proto": "http",
    "X-Forwarded-Agent": "curl/7.54.0",
    "X-Forwarded-Request": "GET /echo1/ HTTP/1.1"
  }
}
remarks
variables
proxy_set_header Host $http_host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
These headers should be set in a location only if the software behind the proxy requires them. The header names, such as X-Real-IP, can differ between applications, so check them against the software's requirements.
You don't need:
- rewrite ^/rstudio(/.*) $1 break; because nginx follows the correct rules automatically. You need a rewrite rule only for paths like /path, to cut off path so that the request becomes / (or something else).
- resolver 127.0.0.11 valid=5s; because you use localhost.
- set $upstream_endpoint jupyter:8888; because of proxy_pass.
- proxy_redirect $scheme://$upstream_endpoint/ $scheme://$host/jupyter/; because of proxy_pass.
Everything else looks good.
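Put together, the remarks above suggest a location of roughly the following shape. This is a minimal sketch for the rstudio service from the question rather than a tested drop-in replacement. It relies on two documented nginx behaviours. When proxy_pass includes a URI part, the matched location prefix is replaced by that URI. When proxy_pass contains no variables, the default proxy_redirect is derived from the location and proxy_pass values.

```nginx
# Redirect the bare path so that /rstudio works as well as /rstudio/.
location = /rstudio {
    return 301 /rstudio/;
}

location /rstudio/ {
    # proxy_pass has a URI part ("/"), so nginx replaces the matched
    # "/rstudio/" prefix with "/" before forwarding; no rewrite is needed.
    proxy_pass http://rstudio:8787/;

    # No explicit proxy_redirect: with a literal proxy_pass the default
    # rule maps "Location: http://rstudio:8787/..." back to "/rstudio/...".
}
```

The trade-off is that a literal host name in proxy_pass is resolved once, when nginx starts, so the rstudio service has to be resolvable at that point. The variable plus resolver form used in the question resolves the name at request time instead, but it also disables the default proxy_redirect and the automatic prefix replacement, which is why that version needs the explicit rewrite and proxy_redirect lines.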
Upvotes: 1