Davide Caldara

Reputation: 21

pyspark and Jupyter Notebook don't work on Windows 10

I followed this guide step by step to install pyspark and Jupyter Notebook on my Windows 10 PC: http://www.jbencina.com/blog/2017/07/15/installing-pyspark-jupyter-notebook-windows/

I've set everything exactly as the guide says, but when I run the command "pyspark" I receive this error message:

error executing Jupyter command 'notebook': [Errno 'jupyter-notebook' not found] 2

I looked for a solution but didn't find any case exactly like mine; the most similar ones were about pyspark on Linux.

If anyone could explain what I need to change to make it work, I would be grateful! If anyone also has another guide on how to use pyspark on Windows, that would be great too. I'm still a newbie.

Upvotes: 1

Views: 2055

Answers (1)

Vinay Chaudhari

Reputation: 128

INSTALL PYSPARK ON WINDOWS 10 WITH JUPYTER NOTEBOOK AND ANACONDA NAVIGATOR

STEP 1

Download Packages

1) spark-2.2.0-bin-hadoop2.7.tgz

2) Java JDK version 8

3) Anaconda v5.2

4) scala-2.12.6.msi

5) Hadoop v2.7.1

STEP 2

MAKE A SPARK FOLDER IN THE C:\ DRIVE AND PUT EVERYTHING INSIDE IT

NOTE: DURING INSTALLATION OF SCALA, SET ITS INSTALL PATH INSIDE THE SPARK FOLDER

STEP 3

NOW SET NEW WINDOWS ENVIRONMENT VARIABLES

  1. HADOOP_HOME=C:\spark\hadoop

  2. JAVA_HOME=C:\Program Files\Java\jdk1.8.0_151

  3. SCALA_HOME=C:\spark\scala\bin

  4. SPARK_HOME=C:\spark\spark

  5. PYSPARK_PYTHON=C:\Users\user\Anaconda3\python.exe

  6. PYSPARK_DRIVER_PYTHON=C:\Users\user\Anaconda3\Scripts\jupyter.exe

  7. PYSPARK_DRIVER_PYTHON_OPTS=notebook

  8. NOW SELECT PATH OF SPARK :

    Click on Edit, then New

    Add "C:\spark\spark\bin" to the Windows "Path" variable
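Before moving on, it can help to confirm that the variables above are actually visible to Python. The following is only a minimal sketch: the variable names come from the list above, while the function name check_spark_env and the checks it performs are my own assumptions, not part of Spark or Anaconda.

```python
import os

# Variable names taken from STEP 3. Splitting them into "path" and "plain"
# variables is an assumption made for this sketch.
PATH_VARS = ["HADOOP_HOME", "JAVA_HOME", "SCALA_HOME", "SPARK_HOME",
             "PYSPARK_PYTHON", "PYSPARK_DRIVER_PYTHON"]
PLAIN_VARS = ["PYSPARK_DRIVER_PYTHON_OPTS"]


def check_spark_env(env=None):
    """Return a list of human-readable problems found in the environment.

    `env` is any mapping of variable names to values; it defaults to the
    real process environment.
    """
    env = os.environ if env is None else env
    problems = []
    for name in PATH_VARS + PLAIN_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif name in PATH_VARS and not os.path.exists(value):
            # The variable is set, but the folder or file is not there.
            problems.append(f"{name} points to a missing path: {value}")
    return problems
```

Run check_spark_env() from a freshly opened Anaconda prompt (environment variables are only picked up by new windows) and print the returned list; an empty list means every variable is set and every path exists.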

STEP 4

  • Make a folder where you want to store your Jupyter Notebook outputs and files
  • After that, open the Anaconda command prompt and cd into that folder
  • Then enter pyspark

That's it. Your browser will open with Jupyter running on localhost.
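If pyspark fails with the error from the question instead ('jupyter-notebook' not found), the likely cause is that the jupyter launcher cannot find the separate jupyter-notebook executable it dispatches the notebook subcommand to. The sketch below is one way to check this with the standard library only; the function name find_jupyter_commands is made up for this example.

```python
import shutil


def find_jupyter_commands(path=None):
    """Map each Jupyter executable name to its resolved location, or None.

    `path` is an optional search path string; by default the PATH
    environment variable of the current process is used.
    """
    names = ["jupyter", "jupyter-notebook"]
    return {name: shutil.which(name, path=path) for name in names}
```

If the value for jupyter-notebook is None when you call this from the Anaconda prompt, the notebook package is probably missing from that Anaconda installation, or its Scripts folder is not on Path.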

STEP 5

Check whether pyspark is working.

Type some simple code in a notebook cell and run it:

from pyspark.sql import Row

# Build a single Row; if this prints without an ImportError, pyspark is importable
a = Row(name='Vinay', age=22, height=165)
print("a: ", a)

Upvotes: -1
