user2830451

Reputation: 2316

Can't customize ports for Jupyter and Zeppelin at Google Dataproc cluster creation time

I have a Dataproc cluster that initializes Datalab and installs Jupyter and Zeppelin as optional components. I want Jupyter to listen on port 8124 and Zeppelin on port 8081 from cluster creation time, and on no other ports. I passed the following flags to gcloud dataproc clusters create:

--metadata ZEPPELIN-PORT=8081 (tried --metadata zeppelin-port=8081 as well)

--metadata JUPYTER_PORT=8124

However, both still use their default ports (8123 for Jupyter and 8080 for Zeppelin), and 8124 and 8081 are unreachable. To make things worse, Datalab also uses 8080 by default, so on that port I can only reach Zeppelin, not Datalab.

I can change the ports after the cluster is created, but that's not ideal for my use case.

Any suggestions are appreciated. Thank you.

Upvotes: 2

Views: 315

Answers (2)

Diego Rodríguez

Reputation: 905

With the latest Dataproc version you should be able to remap the ports. From the release notes:

Image 1.3 and 1.4: Allow remapping Jupyter and Zeppelin Optional Component ports via dataproc:{jupyter,zeppelin}.port properties

https://cloud.google.com/dataproc/docs/release-notes#may_9_2019
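
As a rough sketch of how those properties might be passed at creation time: the property keys come from the release note above, while the cluster name, region, and image version are placeholder assumptions, and the script only prints the command so it can be reviewed before running. Depending on your gcloud version, the optional components flag may need `gcloud beta`.

```shell
#!/bin/bash
# Sketch only: my-cluster and us-central1 are hypothetical placeholders.

JUPYTER_PORT=8124
ZEPPELIN_PORT=8081

# Compose the comma-separated value for --properties from the
# dataproc:{jupyter,zeppelin}.port keys named in the release note.
port_properties() {
  echo "dataproc:jupyter.port=$1,dataproc:zeppelin.port=$2"
}

PROPERTIES="$(port_properties "${JUPYTER_PORT}" "${ZEPPELIN_PORT}")"

# Dry run: print the command instead of executing it.
echo gcloud dataproc clusters create my-cluster \
    --region=us-central1 \
    --image-version=1.4 \
    --optional-components=JUPYTER,ZEPPELIN \
    --properties="${PROPERTIES}"
```

Remove the leading `echo` to actually create the cluster once the printed command looks right.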

Upvotes: 2

Dennis Huo

Reputation: 10687

Unfortunately, there is indeed no first-class supported property for this at the moment, though it could become a Dataproc feature in the future.

In the meantime, however, running an initialization action which modifies the ports should be effectively equivalent to modifying it through a property, with just a few seconds of delay to reboot the services.

The following init action remaps Jupyter to 8124 and Zeppelin to 8081 automatically at cluster creation time, and also works with Dataproc Component Gateway if that is enabled.

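#!/bin/bash
# change-ports.sh
# Remaps the Zeppelin and Jupyter ports on the master node.

readonly ZEPPELIN_PORT=8081
readonly JUPYTER_PORT=8124

# Dataproc exposes the node's role ('Master' or 'Worker') as instance metadata.
readonly ROLE="$(/usr/share/google/get_metadata_value attributes/dataproc-role)"

# The notebook services only run on the master node.
if [[ "${ROLE}" == 'Master' ]]; then
  # Zeppelin: append the port override to its environment file, then restart.
  if [[ -f /etc/zeppelin/conf/zeppelin-env.sh ]]; then
    echo "export ZEPPELIN_PORT=${ZEPPELIN_PORT}" \
        >> /etc/zeppelin/conf/zeppelin-env.sh
    systemctl restart zeppelin
  fi

  # Jupyter: append the port override to the notebook config, then restart.
  if [[ -f /etc/jupyter/jupyter_notebook_config.py ]]; then
    echo "c.NotebookApp.port = ${JUPYTER_PORT}" \
        >> /etc/jupyter/jupyter_notebook_config.py
    systemctl restart jupyter
  fi

  # Component Gateway (Knox): point the proxy at the new ports, then restart.
  if [[ -f /etc/knox/conf/topologies/default.xml ]]; then
    sed -i "s/localhost:8080/localhost:${ZEPPELIN_PORT}/g" \
        /etc/knox/conf/topologies/default.xml
    sed -i "s/localhost:8123/localhost:${JUPYTER_PORT}/g" \
        /etc/knox/conf/topologies/default.xml
    systemctl restart knox
  fi
fi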
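
To use the script above as an init action, it has to be staged in Cloud Storage and referenced when the cluster is created. A minimal sketch of that step, assuming a hypothetical bucket `my-bucket` and cluster name `my-cluster` (neither is from the original answer); the commands are only printed so they can be checked first:

```shell
#!/bin/bash
# Hypothetical placeholders: replace with your own bucket and cluster name.
BUCKET="my-bucket"
CLUSTER_NAME="my-cluster"
INIT_ACTION_URI="gs://${BUCKET}/change-ports.sh"

# Dry run: print the upload and create commands instead of executing them.
echo gsutil cp change-ports.sh "${INIT_ACTION_URI}"
echo gcloud dataproc clusters create "${CLUSTER_NAME}" \
    --optional-components=JUPYTER,ZEPPELIN \
    --initialization-actions="${INIT_ACTION_URI}"
```

Drop the two `echo` prefixes to run the commands for real; any other flags you already pass to `gcloud dataproc clusters create` can be kept alongside `--initialization-actions`.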

Upvotes: 0
