Reputation: 227
I am installing Spark 1.2.1 on Windows 8 and have downloaded the prebuilt package for Hadoop 2.4.
When I run pyspark I get the following error:
C:\Users\Dinesh\Desktop\spark-1.2.1-bin-hadoop2.4>bin\pyspark
Running python with PYTHONPATH=C:\Users\Dinesh\Desktop\spark-1.2.1-bin-hadoop2.4\bin\..\python\lib\py4j-0.8.2.1-src.zip;C:\Users\Dinesh\Desktop\spark-1.2.1-bin-hadoop2.4\bin\..\python;
Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
The system cannot find the path specified.
Traceback (most recent call last):
  File "C:\Users\Dinesh\Desktop\spark-1.2.1-bin-Hadoop2.4\bin\..\python\pyspark\shell.py", line 45, in <module>
    sc = SparkContext(appName="PySparkShell", pyFiles=add_files)
  File "C:\Users\Dinesh\Desktop\spark-1.2.1-bin-hadoop2.4\python\pyspark\context.py", line 102, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway)
  File "C:\Users\Dinesh\Desktop\spark-1.2.1-bin-hadoop2.4\python\pyspark\context.py", line 212, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "C:\Users\Dinesh\Desktop\spark-1.2.1-bin-hadoop2.4\python\pyspark\java_gateway.py", line 73, in launch_gateway
    raise Exception(error_msg)
Exception: Launching GatewayServer failed with exit code 1!
Warning: Expected GatewayServer to output a port, but found no output.
I have searched and found that, in general, this error occurs when the path variables are not correctly defined, but I have checked and my variables are all in place. How can I solve the error? And which path is "The system cannot find the path specified." actually talking about?
Upvotes: 5
Views: 11859
Reputation: 111
In my case the problem came from the terminal I was using. In Git Bash on Windows I was getting the error "line 96: CMD: bad array subscript" when executing spark-shell, but when I tried PowerShell it worked fine.
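For reference, a minimal way to try the same launcher from PowerShell (the install path is the one from the question; adjust it to your machine):

    cd C:\Users\Dinesh\Desktop\spark-1.2.1-bin-hadoop2.4
    .\bin\spark-shell.cmd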
Upvotes: 7
Reputation: 406
The way I debugged this issue was to REM out the "@echo off" command in all the command files that get called by pyspark.cmd. In the end I nailed it down to having JAVA_HOME set to "C:\ProgramData\Oracle\Java\javapath", which is wrong because one of the cmd scripts appends "\bin" to JAVA_HOME before calling java.exe, and that was triggering the "The system cannot find the path specified." error. So I changed JAVA_HOME to "C:\Program Files\Java\jdk1.8.0_25" and it worked fine.
Now I have to un-REM the "@echo off" lines. Hope it helps!
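A minimal sketch of that check from a plain Command Prompt (the JDK path is the one from this answer; substitute whatever JDK you actually have installed):

    rem Show what JAVA_HOME currently points at
    echo %JAVA_HOME%

    rem The Spark scripts append \bin before calling java.exe, so this file must exist;
    rem with JAVA_HOME pointing at the javapath shim it will not be found
    dir "%JAVA_HOME%\bin\java.exe"

    rem Point JAVA_HOME at a real JDK root for the current session
    rem (use setx or the System Properties dialog to make it permanent)
    set JAVA_HOME=C:\Program Files\Java\jdk1.8.0_25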
Upvotes: 2
Reputation: 3228
It may be caused by Cygwin in the DOS PATH. Spark uses the find command in the file 'spark-class2.cmd', which then picked up the Cygwin find command instead of the DOS find command, and the two work somewhat differently. I removed Cygwin from the DOS PATH, which solved the problem.
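To check which find wins, where.exe (which ships with Windows) lists matches in PATH resolution order. A sketch, assuming the usual C:\cygwin\bin install location:

    rem List every find.exe on the PATH; the first match is the one scripts get
    where find

    rem If the Cygwin one is listed first, drop its directory from PATH
    rem for the current session (adjust the path to your Cygwin install)
    set PATH=%PATH:C:\cygwin\bin;=%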
Upvotes: 0