user2871856

Reputation: 227

Error in Spark installation --pyspark

I am installing Spark 1.2.1 on Windows 8 and I have downloaded a prebuilt package for Hadoop 2.4.

When I run pyspark I get the following error:

C:\Users\Dinesh\Desktop\spark-1.2.1-bin-hadoop2.4>bin\pyspark
Running python with PYTHONPATH=C:\Users\Dinesh\Desktop\spark-1.2.1-bin-hadoop2.4\bin\..\python\lib\py4j-0.8.2.1-src.zip;C:\Users\Dinesh\Desktop\spark-1.2.1-bin-hadoop2.4\bin\..\python;
Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
The system cannot find the path specified.
Traceback (most recent call last):
  File "C:\Users\Dinesh\Desktop\spark-1.2.1-bin-Hadoop2.4\bin\..\python\pyspark\shell.py", line 45, in <module>
    sc = SparkContext(appName="PySparkShell", pyFiles=add_files)
  File "C:\Users\Dinesh\Desktop\spark-1.2.1-bin-hadoop2.4\python\pyspark\context.py", line 102, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File "C:\Users\Dinesh\Desktop\spark-1.2.1-bin-hadoop2.4\python\pyspark\context.py", line 212, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "C:\Users\Dinesh\Desktop\spark-1.2.1-bin-hadoop2.4\python\pyspark\java_gateway.py", line 73, in launch_gateway
    raise Exception(error_msg)
Exception: Launching GatewayServer failed with exit code 1!
Warning: Expected GatewayServer to output a port, but found no output.

I have searched and found that in general this error is caused by the path variables not being defined correctly, but I have checked and my variables are all in place. How can I solve the error? Which path is "The system cannot find the path specified." talking about?

Upvotes: 5

Views: 11859

Answers (3)

mecorre1

Reputation: 111

In my case the problem came from the terminal I was using. In Git Bash on Windows I was getting the error "line 96: CMD: bad array subscript" when executing spark-shell, but when I tried PowerShell it worked fine.
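If you hit the same thing, running the Windows .cmd scripts directly from PowerShell (or cmd.exe) bypasses the bash wrappers entirely; the install path below is just an example:

cd C:\spark-1.2.1-bin-hadoop2.4
.\bin\spark-shell.cmd
.\bin\pyspark.cmd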

Upvotes: 7

alexs

Reputation: 406

The way I debugged this issue was to rem out the "@echo off" command in all the command files that get called by pyspark.cmd. In the end I nailed it down to having JAVA_HOME set to "C:\ProgramData\Oracle\Java\javapath", which is wrong because one of the cmd scripts appends "\bin" to JAVA_HOME before calling java.exe, and that was triggering the "The system cannot find the path specified." error. So I changed JAVA_HOME to "C:\Program Files\Java\jdk1.8.0_25" and it worked fine.

Now I have to un-rem all the "@echo off" lines. Hope it helps!
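For anyone repeating this, a minimal sketch of the same debugging session (the paths come from this answer; your script names and JDK version may differ):

rem In each script called by pyspark.cmd, comment out the echo suppression
rem so every command is printed as it runs:
rem @echo off

rem Then, from a fresh cmd window, inspect what the scripts will use:
echo %JAVA_HOME%
rem The scripts effectively run "%JAVA_HOME%\bin\java.exe", so this file must exist:
dir "%JAVA_HOME%\bin\java.exe"
rem Point JAVA_HOME at the JDK root rather than ...\Oracle\Java\javapath:
set JAVA_HOME=C:\Program Files\Java\jdk1.8.0_25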

Upvotes: 2

Shawn Guo

Reputation: 3228

It may be caused by Cygwin being in the DOS PATH. Spark uses the find command in the file 'spark-class2.cmd', which then picked up the Cygwin find command instead of the DOS find command, and the two work somewhat differently. I removed Cygwin from the DOS PATH, which solved the problem.
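One quick way to check which find wins (where.exe ships with Windows; the Cygwin location below is an assumption):

where find
rem If something like C:\cygwin64\bin\find.exe is listed before
rem C:\Windows\System32\find.exe, the Cygwin version shadows the DOS one.
rem For the current session you can put System32 first instead of editing PATH globally:
set PATH=C:\Windows\System32;%PATH%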

Upvotes: 0
