Reputation: 1230
I have created an AWS EMR cluster and notebook using default settings.
When I open the notebook, the kernel won't launch. I get the message "Workspace is not attached to cluster".
A clue
I looked at the log files created by a cluster where the notebook failed.
In the log file https://aws-logs-***.s3.amazonaws.com/elasticmapreduce/j-3SOK08VFSQDPO/node/i-04af0a3d2d6d96cac/daemons/emr-on-cluster-env/gateway.log.gz, I found the following:
Jupyter Enterprise Gateway 2.1.0 is available at http://127.0.0.1:9547
User 'root' is not authorized to start kernel 'Python 3'. Ensure KERNEL_USERNAME is set to an appropriate value and retry the request.
User 'root' is not authorized to start kernel 'PySpark'. Ensure KERNEL_USERNAME is set to an appropriate value and retry the request.
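For anyone searching their own logs, here is a minimal sketch of pulling these authorization failures out of a locally downloaded copy of gateway.log.gz. The regular expression is based only on the two messages quoted above, and the function names and file path are hypothetical, so treat it as a starting point rather than a complete parser.

```python
import gzip
import re

# Matches the authorization error text quoted from gateway.log above.
# Assumption: other Enterprise Gateway versions word the message the same way.
AUTH_ERROR = re.compile(
    r"User '(?P<user>[^']+)' is not authorized to start kernel '(?P<kernel>[^']+)'"
)


def find_auth_errors(lines):
    """Return a (user, kernel) pair for every authorization failure in lines."""
    hits = []
    for line in lines:
        match = AUTH_ERROR.search(line)
        if match:
            hits.append((match.group("user"), match.group("kernel")))
    return hits


def scan_gateway_log(path):
    """Scan a gzip-compressed log file that was downloaded to a local path."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return find_auth_errors(handle)
```

Calling scan_gateway_log("gateway.log.gz") on a local copy of the file would list which user was rejected for which kernel.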
How I got the notebook kernel to work
Following the Stack Overflow post "Notebooks on EMR (AWS): Failed to start kernel", I switched from the AWS root account to an IAM user. This worked with EMR 6.5.0.
My question
What changed when I launched the cluster with an IAM account? How could I have figured out that using the root user is the problem?
EMR is a black box to me. Thanks in advance for helping me understand the inner workings of this amazing technology.
Upvotes: 6
Views: 3371
Reputation: 5915
This is the key issue:
User 'root' is not authorized to start kernel 'Python 3'. Ensure KERNEL_USERNAME is set to an appropriate value and retry the request.
User 'root' is not authorized to start kernel 'PySpark'. Ensure KERNEL_USERNAME is set to an appropriate value and retry the request.
You need to create a regular IAM user with EMR permissions, log in as that user, and start the notebook from there. The main AWS account you signed up with is the root account, which is why the gateway sees the user as 'root'. I talked to AWS Support and got my notebook running that way.
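One way to check which identity you are using is to look at the caller ARN: the root user's ARN ends in ":root", while an IAM user's ends in "user/<name>". The sketch below is only an illustration. The helper names are hypothetical, and it checks the credentials the script runs with (via boto3 and STS), not the console session you used to create the notebook.

```python
def is_root_arn(arn):
    """Return True if the ARN is an account root user.

    A root ARN looks like arn:aws:iam::123456789012:root, whereas an IAM
    user looks like arn:aws:iam::123456789012:user/alice.
    """
    parts = arn.split(":", 5)
    return len(parts) == 6 and parts[2] == "iam" and parts[5] == "root"


def current_identity_is_root():
    """Ask STS who the configured credentials belong to."""
    # boto3 is a third-party package and needs AWS credentials configured.
    import boto3

    arn = boto3.client("sts").get_caller_identity()["Arn"]
    return is_root_arn(arn)
```

If current_identity_is_root() returns True, you are working as the root user and would need to switch to an IAM user before starting the notebook.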
Upvotes: 4