Reputation: 7746
This docker-compose.yml with one datanode seems to work OK:
version: "3"
services:
  namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop3.2.1-java8
    container_name: namenode
    restart: always
    ports:
      - 9870:9870
      - 9010:9000
    volumes:
      - hadoop_namenode:/hadoop/dfs/name
    environment:
      - CLUSTER_NAME=test
      - CORE_CONF_fs_defaultFS=hdfs://namenode:9000
    env_file:
      - ./hadoop.env
  datanode:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode
    restart: always
    volumes:
      - hadoop_datanode:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
      CORE_CONF_fs_defaultFS: hdfs://namenode:9000
    ports:
      - "9864:9864"
    env_file:
      - ./hadoop.env
  resourcemanager:
    image: bde2020/hadoop-resourcemanager:2.0.0-hadoop3.2.1-java8
    container_name: resourcemanager
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode:9864"
    env_file:
      - ./hadoop.env
  nodemanager1:
    image: bde2020/hadoop-nodemanager:2.0.0-hadoop3.2.1-java8
    container_name: nodemanager
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode:9864 resourcemanager:8088"
    env_file:
      - ./hadoop.env
  historyserver:
    image: bde2020/hadoop-historyserver:2.0.0-hadoop3.2.1-java8
    container_name: historyserver
    restart: always
    environment:
      SERVICE_PRECONDITION: "namenode:9000 namenode:9870 datanode:9864 resourcemanager:8088"
    volumes:
      - hadoop_historyserver:/hadoop/yarn/timeline
    env_file:
      - ./hadoop.env
  spark-master:
    image: bde2020/spark-master:3.0.0-hadoop3.2
    container_name: spark-master
    depends_on:
      - namenode
      - datanode
    ports:
      - "8080:8080"
      - "7077:7077"
    environment:
      - INIT_DAEMON_STEP=setup_spark
      - CORE_CONF_fs_defaultFS=hdfs://namenode:9000
  spark-worker-1:
    image: bde2020/spark-worker:3.0.0-hadoop3.2
    container_name: spark-worker-1
    depends_on:
      - spark-master
    ports:
      - "8081:8081"
    environment:
      - "SPARK_MASTER=spark://spark-master:7077"
      - CORE_CONF_fs_defaultFS=hdfs://namenode:9000
  hive-server:
    image: bde2020/hive:2.3.2-postgresql-metastore
    container_name: hive-server
    depends_on:
      - namenode
      - datanode
    env_file:
      - ./hadoop-hive.env
    environment:
      HIVE_CORE_CONF_javax_jdo_option_ConnectionURL: "jdbc:postgresql://hive-metastore/metastore"
      SERVICE_PRECONDITION: "hive-metastore:9083"
    ports:
      - "10000:10000"
  hive-metastore:
    image: bde2020/hive:2.3.2-postgresql-metastore
    container_name: hive-metastore
    env_file:
      - ./hadoop-hive.env
    command: /opt/hive/bin/hive --service metastore
    environment:
      SERVICE_PRECONDITION: "namenode:9870 datanode:9864 hive-metastore-postgresql:5432"
    ports:
      - "9083:9083"
  hive-metastore-postgresql:
    image: bde2020/hive-metastore-postgresql:2.3.0
    container_name: hive-metastore-postgresql
  presto-coordinator:
    image: shawnzhu/prestodb:0.181
    container_name: presto-coordinator
    ports:
      - "8089:8089"
volumes:
  hadoop_namenode:
  hadoop_datanode:
  hadoop_historyserver:
I want to modify it so that it uses three datanodes. I tried adding this right below the original datanode section, but it doesn't seem to like it. It basically adds new names and new ports:
  datanode1:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode1
    restart: always
    volumes:
      - hadoop_datanode:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
      CORE_CONF_fs_defaultFS: hdfs://namenode:9000
    ports:
      - "9865:9865"
    env_file:
      - ./hadoop.env
  datanode2:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode2
    restart: always
    volumes:
      - hadoop_datanode:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
      CORE_CONF_fs_defaultFS: hdfs://namenode:9000
    ports:
      - "9866:9866"
    env_file:
      - ./hadoop.env
Should this work, and if not, what do I need to change to get three datanodes?
Upvotes: 0
Views: 3029
Reputation: 53
Check your ports setting. It seems that the port mapping is faulty. You have "9865:9865" (datanode1) and "9866:9866" (datanode2).
Try setting them to "9865:9864" and "9866:9864" respectively. 9864 is the default port the datanode listens on inside its container, and the first number in each mapping only defines the port on which that datanode is reachable from outside the Docker network.
With the suggested configuration, your datanodes will be reachable at datanode:9864, datanode1:9864, and datanode2:9864 from within the Docker network, and on ports 9864, 9865, and 9866 of the Docker host from outside it.
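As a rough sketch of that suggestion, the two extra services could look something like the fragment below, merged into the existing `services:` and `volumes:` sections. Only the `ports` lines come from the advice above. The separate volumes `hadoop_datanode1` and `hadoop_datanode2` are my own assumption and were not in your file: datanodes that share one data directory generally cannot all lock it, so giving each its own volume seems safer, but I have not tested this against your setup.

```yaml
services:
  datanode1:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode1
    restart: always
    volumes:
      # Assumption: a dedicated volume instead of the shared hadoop_datanode
      - hadoop_datanode1:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
      CORE_CONF_fs_defaultFS: hdfs://namenode:9000
    ports:
      # HOST:CONTAINER, the container side stays at the default 9864
      - "9865:9864"
    env_file:
      - ./hadoop.env
  datanode2:
    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8
    container_name: datanode2
    restart: always
    volumes:
      # Assumption: a dedicated volume instead of the shared hadoop_datanode
      - hadoop_datanode2:/hadoop/dfs/data
    environment:
      SERVICE_PRECONDITION: "namenode:9870"
      CORE_CONF_fs_defaultFS: hdfs://namenode:9000
    ports:
      - "9866:9864"
    env_file:
      - ./hadoop.env
volumes:
  # Hypothetical volume names, declared alongside the existing ones
  hadoop_datanode1:
  hadoop_datanode2:
```

Running `docker-compose config` on the merged file should show whether the result still parses before you bring the stack up.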
Upvotes: 1