Volatil3

Reputation: 14978

Unable to access Spark nodes in Docker

I am using this setup (https://github.com/mvillarrealb/docker-spark-cluster.git) to establish a Spark cluster, but none of the IPs mentioned there, such as 10.5.0.2, are accessible via the browser; every request times out. I am unable to figure out what I am doing wrong.

I am using Docker 2.3 on macOS Catalina.

In the spark-base Dockerfile I am using the following settings instead of the ones given there:

ENV DAEMON_RUN=true
ENV SPARK_VERSION=3.0.0
ENV HADOOP_VERSION=3.2
ENV SCALA_VERSION=2.12.4
ENV SCALA_HOME=/usr/share/scala
ENV SPARK_HOME=/spark

Also, when the cluster is running, the console still shows Spark 2.4.3 when I try to open the web UI.

Upvotes: 1

Views: 1556

Answers (1)

Neo Anderson

Reputation: 6350

The Dockerfile only declares which ports the container listens on (EXPOSE); it does not publish them to the host.
The compose file tells the host which ports to publish and to which container ports the traffic should be forwarded.
If the host (source) port is not specified, Docker picks a random one. That helps in this scenario because you have multiple workers and cannot give all of them the same fixed host port; doing so would result in a conflict.

version: "3.7"
services:
  spark-master:
    image: spydernaz/spark-master:latest
    ports:
      - "9090:8080"
      - "7077:7077"
    volumes:
      - ./apps:/opt/spark-apps
      - ./data:/opt/spark-data
    environment:
      - "SPARK_LOCAL_IP=spark-master"
  spark-worker:
    image: spydernaz/spark-worker:latest
    depends_on:
      - spark-master
    ports:
      - "8081"
    environment:
      - SPARK_MASTER=spark://spark-master:7077
      - SPARK_WORKER_CORES=1
      - SPARK_WORKER_MEMORY=1G
      - SPARK_DRIVER_MEMORY=128m
      - SPARK_EXECUTOR_MEMORY=256m
    volumes:
      - ./apps:/opt/spark-apps
      - ./data:/opt/spark-data
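
To make the two kinds of `ports` entries in the file above concrete, here is a minimal Python sketch of how they can be read. The names `PortMapping` and `parse_port_entry` are made up for illustration and are not part of Docker or Compose. The sketch only handles the two forms used above ("HOST:CONTAINER" and "CONTAINER"), not IP prefixes, port ranges, or protocol suffixes.

```python
from typing import NamedTuple, Optional


class PortMapping(NamedTuple):
    host_port: Optional[int]  # None means Docker chooses a free host port
    container_port: int


def parse_port_entry(entry: str) -> PortMapping:
    """Interpret a short-syntax entry: "HOST:CONTAINER" or just "CONTAINER"."""
    parts = entry.split(":")
    if len(parts) == 1:
        # Only the container port is given, so the host port is assigned at run time.
        return PortMapping(None, int(parts[0]))
    if len(parts) == 2:
        return PortMapping(int(parts[0]), int(parts[1]))
    raise ValueError(f"unsupported port entry: {entry!r}")


# The three entries used in the compose file above.
for entry in ("9090:8080", "7077:7077", "8081"):
    print(entry, "->", parse_port_entry(entry))
```

The master gets fixed host ports (9090 and 7077), while the worker entry leaves the host port open, which is what lets several worker containers run side by side.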

To find the randomly generated published port for each worker, run docker ps. The PORTS column shows what you need:

PORTS 
0.0.0.0:32768->8080/tcp 

Here, port 32768 on the host machine (localhost:32768) forwards to [worker-IP]:8080.
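
If you have several workers, reading the PORTS column by hand gets tedious. As a rough sketch, the following Python snippet pulls the host port out of a PORTS value like the one above and turns it into a URL you can open in the browser. The helper names `published_ports` and `ui_urls` are hypothetical, and the regular expression assumes the IPv4 `ip:host->container/proto` layout shown above; entries that are only exposed (for example `8080/tcp` with no arrow) are ignored.

```python
import re
from typing import List, Tuple

# Matches published IPv4 mappings such as "0.0.0.0:32768->8080/tcp".
_PUBLISHED = re.compile(
    r"(?P<ip>[\d.]+):(?P<host>\d+)->(?P<container>\d+)/(?P<proto>\w+)"
)


def published_ports(ports_column: str) -> List[Tuple[int, int]]:
    """Return (host_port, container_port) pairs found in one PORTS cell."""
    return [
        (int(m["host"]), int(m["container"]))
        for m in _PUBLISHED.finditer(ports_column)
    ]


def ui_urls(ports_column: str) -> List[str]:
    """Build localhost URLs for every published port in the cell."""
    return [f"http://localhost:{host}" for host, _ in published_ports(ports_column)]


print(ui_urls("0.0.0.0:32768->8080/tcp"))  # ['http://localhost:32768']
```

You could feed it the output of `docker ps --format "{{.Names}}\t{{.Ports}}"`, one line per container, to list the URL of each worker.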

Upvotes: 1
