Reputation: 2316
I have a Dataproc cluster that initializes Datalab and installs Jupyter and Zeppelin as optional components. I want Jupyter to listen on port 8124 and Zeppelin on port 8081, set at cluster creation time. They must use exactly these two ports and no others. I passed the following flags to gcloud dataproc clusters create
at cluster creation time:
--metadata ZEPPELIN-PORT=8081
(I also tried --metadata zeppelin-port=8081)
--metadata JUPYTER_PORT=8124
However, both services still use their default ports (8123 for Jupyter and 8080 for Zeppelin), and nothing is reachable on 8124 or 8081. To make matters worse, Datalab also uses 8080 by default, so that port serves only Zeppelin and I cannot reach Datalab on it.
I can change the ports after the cluster is created, but that is not ideal for my use case.
Any suggestions are appreciated. Thank you.
Upvotes: 2
Views: 315
Reputation: 905
With a recent Dataproc image version you should be able to remap the ports. From the release notes:
Image 1.3 and 1.4: Allow remapping Jupyter and Zeppelin Optional Component ports via dataproc:{jupyter,zeppelin}.port properties
https://cloud.google.com/dataproc/docs/release-notes#may_9_2019
Upvotes: 2
Reputation: 10687
Unfortunately there is indeed no way to do this in a first-class supported property at the moment, but it could become a feature in Dataproc someday in the future.
In the meantime, however, running an initialization action which modifies the ports should be effectively equivalent to modifying it through a property, with just a few seconds of delay to reboot the services.
The following init action will remap Jupyter to 8124 and Zeppelin to 8081 automatically at cluster-creation time, and also works with Dataproc Component Gateway if that is enabled.
#!/bin/bash
# change-ports.sh
ZEPPELIN_PORT=8081
JUPYTER_PORT=8124

# Only the master node runs the notebook services.
readonly ROLE="$(/usr/share/google/get_metadata_value attributes/dataproc-role)"
if [[ "${ROLE}" == 'Master' ]]; then
  # Override the Zeppelin port and restart the service.
  if [ -f /etc/zeppelin/conf/zeppelin-env.sh ]; then
    echo "export ZEPPELIN_PORT=${ZEPPELIN_PORT}" \
      >> /etc/zeppelin/conf/zeppelin-env.sh
    systemctl restart zeppelin
  fi

  # Override the Jupyter port and restart the service.
  if [ -f /etc/jupyter/jupyter_notebook_config.py ]; then
    echo "c.NotebookApp.port = ${JUPYTER_PORT}" \
      >> /etc/jupyter/jupyter_notebook_config.py
    systemctl restart jupyter
  fi

  # If a Knox topology is present, point it at the new ports.
  if [ -f /etc/knox/conf/topologies/default.xml ]; then
    sed -i "s/localhost:8080/localhost:${ZEPPELIN_PORT}/g" \
      /etc/knox/conf/topologies/default.xml
    sed -i "s/localhost:8123/localhost:${JUPYTER_PORT}/g" \
      /etc/knox/conf/topologies/default.xml
    systemctl restart knox
  fi
fi
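To wire the script in, it needs to be staged in Cloud Storage and referenced with --initialization-actions. A minimal sketch, assuming a bucket and cluster name that are placeholders of mine (neither appears in the question); it prints the two commands rather than executing them:

```shell
#!/bin/bash
# Sketch only: BUCKET and CLUSTER_NAME are hypothetical placeholders.
BUCKET="my-bucket"
CLUSTER_NAME="my-cluster"
INIT_ACTION="gs://${BUCKET}/change-ports.sh"

# Step 1: stage the init action in Cloud Storage.
UPLOAD_CMD=(gsutil cp change-ports.sh "${INIT_ACTION}")

# Step 2: create the cluster with the optional components and the init action.
CREATE_CMD=(gcloud dataproc clusters create "${CLUSTER_NAME}"
  --optional-components=JUPYTER,ZEPPELIN
  --initialization-actions="${INIT_ACTION}")

# Print both commands for review; run them with "${UPLOAD_CMD[@]}" and "${CREATE_CMD[@]}".
printf '%s ' "${UPLOAD_CMD[@]}"
echo
printf '%s ' "${CREATE_CMD[@]}"
echo
```

Any other flags you already use (Datalab init action, image version, and so on) would be added to the create command alongside these.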
Upvotes: 0