Reputation: 2616
I have an AWS SageMaker notebook that I attempted to launch again. The status of the notebook has been Pending
for over 3 hours now. I've had a look at the CloudWatch logs, and the last few entries in there are:
[I 19:14:57.107 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[W 19:14:57.138 NotebookApp] No web browser found: could not locate runnable browser.
[I 19:14:57.140 NotebookApp] Starting initial scan of virtual environments...
[I 19:15:28.507 NotebookApp] Found new kernels in environments: conda_pytorch_p36, conda_amazonei_mxnet_p27, conda_chainer_p27, conda_mxnet_p27, conda_tensorflow_p27, conda_amazonei_tensorflow_p27, conda_amazonei_tensorflow_p36, conda_mxnet_p36, conda_python3, conda_tensorflow_p36, conda_python2, conda_pytorch_p27, conda_chainer_p36, conda_amazonei_mxnet_p36
There isn't anything in the logs that would indicate why it failed. Compared with the logs from the last time I launched it, everything looks identical up to that point. Is there anything I can do to start the notebook, or to stop and relaunch it?
Upvotes: 2
Views: 8562
Reputation: 63
Try looking at the other logs in CloudWatch for more information. There should be a specific log stream for each lifecycle configuration script.
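A minimal sketch of where to look, assuming the usual layout: as far as I know, notebook instance logs land in the `/aws/sagemaker/NotebookInstances` log group, with streams named `<instance-name>/LifecycleConfigOnCreate`, `<instance-name>/LifecycleConfigOnStart` and `<instance-name>/jupyter.log`. The instance name `my-notebook` and both helper functions are hypothetical, and the boto3 part assumes credentials and a region are already configured.

```python
LOG_GROUP = "/aws/sagemaker/NotebookInstances"  # assumed default log group


def lifecycle_log_streams(instance_name):
    """Return the log stream names expected for one notebook instance."""
    suffixes = ["LifecycleConfigOnCreate", "LifecycleConfigOnStart", "jupyter.log"]
    return [f"{instance_name}/{suffix}" for suffix in suffixes]


def print_recent_events(instance_name, limit=20):
    """Print the latest events from each stream (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    logs = boto3.client("logs")
    for stream in lifecycle_log_streams(instance_name):
        try:
            resp = logs.get_log_events(
                logGroupName=LOG_GROUP,
                logStreamName=stream,
                limit=limit,
                startFromHead=False,  # read from the end of the stream
            )
        except logs.exceptions.ResourceNotFoundException:
            # No lifecycle script of this kind is attached, or it never ran.
            print(f"{stream}: no such stream")
            continue
        for event in resp["events"]:
            print(stream, event["message"])
```

Calling `print_recent_events("my-notebook")` would then show whether the on-start script stopped partway through, which the Jupyter log alone does not reveal.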
I faced a similar problem, and the cause was a timeout in the notebook's start script.
Debugging the script and commenting out its steps one at a time helped me resolve it.
There is also the 'nohup tip' offered by Amazon, which detaches the installation step that is causing the problem from the script's timeout restriction. See the tip here: https://aws.amazon.com/premiumsupport/knowledge-center/sagemaker-lifecycle-script-timeout/
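To illustrate the nohup idea, here is a rough sketch that builds an on-start script with the slow step detached and then base64-encodes it, which is the form the SageMaker API expects for lifecycle configuration content. The slow command, the log path under `/tmp` and the helper names are all placeholders I made up for illustration, not something taken from the linked article.

```python
import base64

# Hypothetical slow step; substitute whatever install is hitting the timeout.
SLOW_COMMAND = "pip install --quiet some-large-package"


def build_on_start_script(slow_command):
    """Wrap a slow command in nohup and background it so the script returns at once."""
    return "\n".join([
        "#!/bin/bash",
        "set -e",
        "# Run detached so it keeps going after the lifecycle script exits.",
        f"nohup {slow_command} > /tmp/on-start-install.log 2>&1 &",
        "",
    ])


def encode_for_lifecycle_config(script):
    """Base64-encode the script text for use as lifecycle config content."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


script = build_on_start_script(SLOW_COMMAND)
encoded = encode_for_lifecycle_config(script)
```

The `encoded` string could then be passed as `OnStart=[{"Content": encoded}]` to boto3's `create_notebook_instance_lifecycle_config` or `update_notebook_instance_lifecycle_config`. The trade-off is that the notebook may come up before the background install has finished, so anything depending on it should check the log file first.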
Upvotes: 1