Patrick Glettig

Reputation: 601

How to troubleshoot 'pyspark' is not recognized... error on Windows?

I have been trying to install Spark (pyspark) on my Windows 10 machine for two weeks, and I have now realized that I need your help.

When I try to start 'pyspark' in the command prompt, I still receive the following error:

The Problem

'pyspark' is not recognized as an internal or external command, operable program or batch file.

To me this hints at a problem with the path/environmental variables, but I cannot find the root of the problem.

My Actions

I have tried multiple tutorials but the best I found was the one by Michael Galarnyk. I followed his tutorial step by step:

These actions should have done the trick, but when I run pyspark --master local[2], I still get the error above. Can you help me track down this error using the information above?

Checks

I ran a couple of checks in the command prompt to verify the following:

Upvotes: 7

Views: 21021

Answers (2)

Vishnu Kant Tripathi

Reputation: 31

Following the steps explained in my blog should resolve your problem:

How to Setup PySpark on Windows https://beasparky.blogspot.com/2020/05/how-to-setup-pyspark-in-windows.html

To set up the environment paths for Spark:

Go to "Advanced System Settings" and set the paths below:
JAVA_HOME="C:\Program Files\Java\jdk1.8.0_181"
HADOOP_HOME="C:\spark-2.4.0-bin-hadoop2.7"
SPARK_HOME="C:\spark-2.4.0-bin-hadoop2.7"
Also, add the bin path of each of these to the PATH system variable.
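
To confirm that these settings are visible to a newly opened command prompt, a small Python check along the following lines may help. It is only a sketch: the names check_spark_env and REQUIRED_VARS are made up for illustration, and it assumes that each variable's bin folder is meant to be on PATH, as described above.

```python
import os

# Variables named in the steps above.
REQUIRED_VARS = ("JAVA_HOME", "HADOOP_HOME", "SPARK_HOME")


def _norm(path):
    # Strip stray quotes and normalise case/separators so comparisons are fair.
    return os.path.normcase(os.path.normpath(path.strip().strip('"')))


def check_spark_env(env=None):
    """Return a list of human-readable problems; an empty list means no problem found."""
    env = os.environ if env is None else env
    path_entries = {
        _norm(entry)
        for entry in env.get("PATH", "").split(os.pathsep)
        if entry.strip()
    }
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
            continue
        home = value.strip().strip('"')
        if not os.path.isdir(home):
            problems.append(f"{name} points to a missing directory: {home}")
        bin_dir = os.path.join(home, "bin")
        if _norm(bin_dir) not in path_entries:
            problems.append(f"{bin_dir} is not on PATH")
    return problems


if __name__ == "__main__":
    # Run this from a NEW command prompt so it sees the updated variables.
    for problem in check_spark_env():
        print("PROBLEM:", problem)
```

If the script prints nothing, all three variables are set, point to existing folders, and have their bin folders on PATH for that session.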

Upvotes: 2

mchl_k

Reputation: 314

I resolved this issue by setting the variables as "system variables" rather than "user variables". Note:

  1. In my case, setting the variables from the command line created "user variables", so I had to use the Advanced settings GUI to enter the values as "system variables".
  2. You may want to rule out an installation issue. To do so, cd into C:\opt\spark\spark-2.3.1-bin-hadoop2.7\bin and run pyspark --master local[2] (make sure winutils.exe is there). If that does not work, the problem goes beyond environment variables.
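
The second check can also be sketched in Python. This is only an illustration: inspect_spark_bin and EXPECTED_FILES are hypothetical names, and the folder is the one from my install, so adjust it to yours. It looks for pyspark.cmd, the Windows launcher that ships in Spark's bin folder, and for winutils.exe, then reports what the command prompt would resolve for pyspark.

```python
import os
import shutil

# Files expected in Spark's bin folder on Windows. pyspark.cmd ships with Spark;
# winutils.exe is obtained separately and, as noted above, should be in this folder.
EXPECTED_FILES = ("pyspark.cmd", "winutils.exe")


def inspect_spark_bin(bin_dir):
    """Map each expected file name to True/False depending on whether it exists."""
    return {
        name: os.path.isfile(os.path.join(bin_dir, name))
        for name in EXPECTED_FILES
    }


if __name__ == "__main__":
    # Folder taken from the note above; change it to match your install.
    bin_dir = r"C:\opt\spark\spark-2.3.1-bin-hadoop2.7\bin"
    for name, found in inspect_spark_bin(bin_dir).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
    # What the shell would run for "pyspark", or None if PATH has no match.
    print("pyspark resolves to:", shutil.which("pyspark"))
```

If both files are found but the last line prints None, the install itself is probably fine and the folder is simply missing from PATH.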

Upvotes: 4
