Reputation: 1746
I took the Spark MOOC on edX and want to keep working in its setup (SparkVM). I can write code and run a few things, but I can't update Python in it. I want to install the Python package scipy.
I followed the instructions given in the group for installing Anaconda, and I was able to install Anaconda in SparkVM. Please find the screenshot below.
But when I try to run any code that requires "pandas" or "scipy", the import fails. Please find the screenshot below. Can anybody please help me?
This question may not be exactly relevant here, but I am asking in case somebody else took the same course and managed to update SparkVM. Please find the screenshot of my SparkVM details below.
Thanks a lot!
Upvotes: 0
Views: 248
Reputation: 330193
The simplest thing you can do is to ignore Anaconda and install SciPy globally, either from the shell:
sudo aptitude update
sudo aptitude install -y python-scipy
or from an IPython notebook:
!sudo aptitude update
!sudo aptitude install -y python-scipy
Since system packages are usually outdated, you may prefer to use pip:
!pip install --user scipy
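Whichever install route you take, it helps to confirm which interpreter the notebook is actually running and whether it can see the packages. The following is only a minimal sketch; `report_packages` is a hypothetical helper name, not part of SparkVM or the course material:

```python
import importlib
import sys


def report_packages(names):
    """Return a dict mapping each package name to True if it imports cleanly."""
    result = {}
    for name in names:
        try:
            importlib.import_module(name)
            result[name] = True
        except ImportError:
            result[name] = False
    return result


# The interpreter running this cell. If Anaconda were in use, this path
# would point inside the Anaconda install directory.
print("interpreter: {}".format(sys.executable))

for name, found in sorted(report_packages(["scipy", "pandas"]).items()):
    print("{}: {}".format(name, "available" if found else "MISSING"))
```

If the printed interpreter is the system Python rather than Anaconda's, packages installed into Anaconda will not be importable from the notebook, which would match the symptom in the question.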
To properly configure Anaconda, you can edit /home/vagrant/spark_notebook.py and adjust the PYSPARK_PYTHON / PYSPARK_DRIVER_PYTHON variables:
setenv('PYSPARK_PYTHON', '/path/to/anaconda/bin/python', overwrite=False)
setenv('PYSPARK_DRIVER_PYTHON', '/path/to/anaconda/bin/ipython')
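Note the `overwrite=False` on the first line. The definition of `setenv` is not shown here, so the following is only a sketch under the assumption that it behaves like POSIX `setenv`, where `overwrite=False` leaves an already-set variable untouched. `set_env` is a hypothetical stand-in, and a plain dict is used in place of the real environment:

```python
import os


def set_env(name, value, overwrite=True, environ=None):
    """Hypothetical stand-in for setenv (assumed POSIX-like semantics).

    Sets environ[name] = value unless overwrite is False and the
    variable already exists. Returns the value in effect afterwards.
    """
    env = os.environ if environ is None else environ
    if overwrite or name not in env:
        env[name] = value
    return env[name]


# Demonstration on a plain dict so the real environment is not modified.
env = {"PYSPARK_PYTHON": "/usr/bin/python"}  # pretend it was already set
effective = set_env("PYSPARK_PYTHON", "/path/to/anaconda/bin/python",
                    overwrite=False, environ=env)
print(effective)  # the pre-existing value is kept under this assumption
```

If that assumption holds, a PYSPARK_PYTHON value exported earlier (for example in a shell profile) would take precedence over the Anaconda path, so it may be worth checking the variable's value before starting the notebook.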
Upvotes: 2