HMK

Reputation: 608

Error in using pyspark with Jupyter

I followed the instructions given on this website, but every time I open a new pyspark notebook I still get the following kernel error. How would I go about resolving this?

[E 15:39:28.693 NotebookApp] Failed to run command:
[u'/anaconda/bin/python', u'-m', u'ipykernel', u'-f', u'/run/user/1000/jupyter/kernel-f04c7a43-accb-403b-9632-d47e6728387e.json']
    PATH='/home/username/anaconda2/bin:/srv/spark/bin:/usr/local/scala/bin:/home/username/anaconda2/bin:/home/username/anaconda2/bin:/srv/spark/bin:/home/username/bin:/home/username/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/lib/jvm/java-8-oracle/db/bin:/usr/lib/jvm/java-8-oracle/jre/bin'
    with kwargs:
{'cwd': u'/home/username', 'stdin': -1, 'preexec_fn': <function <lambda> at 0x7f7280b3c320>, 'stderr': None, 'stdout': None}

 [E 15:39:28.712 NotebookApp] Unhandled error in API request
Traceback (most recent call last):
  File "/home/username/anaconda2/lib/python2.7/site-packages/notebook/base/handlers.py", line 457, in wrapper
    result = yield gen.maybe_future(method(self, *args, **kwargs))
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 1015, in run
    value = future.result()
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/concurrent.py", line 237, in result
    raise_exc_info(self._exc_info)
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 1021, in run
    yielded = self.gen.throw(*exc_info)
  File "/home/username/anaconda2/lib/python2.7/site-packages/notebook/services/sessions/handlers.py", line 62, in post
    kernel_id=kernel_id))
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 1015, in run
    value = future.result()
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/concurrent.py", line 237, in result
    raise_exc_info(self._exc_info)
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 1021, in run
    yielded = self.gen.throw(*exc_info)
  File "/home/username/anaconda2/lib/python2.7/site-packages/notebook/services/sessions/sessionmanager.py", line 79, in create_session
    kernel_name)
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 1015, in run
    value = future.result()
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/concurrent.py", line 237, in result
    raise_exc_info(self._exc_info)
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 1021, in run
    yielded = self.gen.throw(*exc_info)
  File "/home/username/anaconda2/lib/python2.7/site-packages/notebook/services/sessions/sessionmanager.py", line 92, in start_kernel_for_session
    self.kernel_manager.start_kernel(path=kernel_path, kernel_name=kernel_name)
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 1015, in run
    value = future.result()
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/concurrent.py", line 237, in result
    raise_exc_info(self._exc_info)
  File "/home/username/anaconda2/lib/python2.7/site-packages/tornado/gen.py", line 285, in wrapper
    yielded = next(result)
  File "/home/username/anaconda2/lib/python2.7/site-packages/notebook/services/kernels/kernelmanager.py", line 87, in start_kernel
    super(MappingKernelManager, self).start_kernel(**kwargs)
  File "/home/username/anaconda2/lib/python2.7/site-packages/jupyter_client/multikernelmanager.py", line 110, in start_kernel
    km.start_kernel(**kwargs)
  File "/home/username/anaconda2/lib/python2.7/site-packages/jupyter_client/manager.py", line 243, in start_kernel
    **kw)
  File "/home/username/anaconda2/lib/python2.7/site-packages/jupyter_client/manager.py", line 189, in _launch_kernel
    return launch_kernel(kernel_cmd, **kw)
  File "/home/username/anaconda2/lib/python2.7/site-packages/jupyter_client/launcher.py", line 123, in launch_kernel
    proc = Popen(cmd, **kwargs)
  File "/home/username/anaconda2/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/home/username/anaconda2/lib/python2.7/subprocess.py", line 1343, in _execute_child
    raise child_exception

Upvotes: 0

Views: 1033

Answers (1)

Grr

Reputation: 16079

I'm not sure where you got those instructions from, but getting Jupyter to work with PySpark is much easier than this. All you need to do is set the environment variables PYSPARK_DRIVER_PYTHON=jupyter and PYSPARK_DRIVER_PYTHON_OPTS='notebook' and then run pyspark. There are actually directions for this embedded in the pyspark script located in spark/bin.
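As a minimal sketch of that setup, assuming a bash-like shell and that spark/bin is already on your PATH (the question's PATH lists /srv/spark/bin), the two variables can be exported and checked before launching. The echo lines are only there for verification and are not required.

```shell
# Tell the pyspark launcher to use Jupyter as the driver front end.
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'

# Print the values to confirm they are set in this shell session.
echo "PYSPARK_DRIVER_PYTHON=$PYSPARK_DRIVER_PYTHON"
echo "PYSPARK_DRIVER_PYTHON_OPTS=$PYSPARK_DRIVER_PYTHON_OPTS"

# Then start PySpark (left commented out here; run it yourself):
# pyspark
```

The exports only last for the current shell session, so they need to be repeated (or added to your shell profile) for new terminals.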

If you are running PySpark on a cluster and need to access your notebook from another computer on the network, make sure to add ip and port values to your PYSPARK_DRIVER_PYTHON_OPTS string, like so:

export PYSPARK_DRIVER_PYTHON_OPTS='notebook --ip=0.0.0.0 --port=8899'

Then you can just open up a browser and type in computer_name:8899 (where computer_name is the name of the box from which you launched pyspark) and you will find your notebook.
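If you are unsure what name to type, a rough sketch like the following, run on the box where pyspark was launched, prints a candidate address. It assumes the node name reported by uname -n is resolvable from your other machine, which may not hold on every network; PORT and NOTEBOOK_URL are illustrative variable names.

```shell
# Port must match the --port value given in PYSPARK_DRIVER_PYTHON_OPTS.
PORT=8899

# uname -n prints this machine's network node name.
NOTEBOOK_URL="http://$(uname -n):${PORT}"
echo "$NOTEBOOK_URL"
```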

Upvotes: 1
