Reputation: 1767
I downloaded the latest Spark version because of the fix for
ERROR AsyncEventQueue:70 - Dropping event from queue appStatus.
After setting the environment variables and running the same code in PyCharm, I'm getting this error, which I can't find a solution to:
Exception in thread "main" java.util.NoSuchElementException: key not found: _PYSPARK_DRIVER_CONN_INFO_PATH
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:59)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:59)
at org.apache.spark.api.python.PythonGatewayServer$.main(PythonGatewayServer.scala:64)
at org.apache.spark.api.python.PythonGatewayServer.main(PythonGatewayServer.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
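For reference, the environment setup in question amounts to something like the following (the install path and py4j version here are placeholders for my local setup, not prescriptions):
import os, sys

# Hypothetical paths; substitute your own Spark install directory
# and the py4j version shipped with it.
os.environ["SPARK_HOME"] = r"C:\apache-spark"
sys.path.append(os.path.join(os.environ["SPARK_HOME"], "python"))
sys.path.append(os.path.join(os.environ["SPARK_HOME"], "python", "lib",
                             "py4j-0.10.7-src.zip"))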
Any help?
Upvotes: 5
Views: 8262
Reputation: 41
I ran into this problem too. Here is what I did, hoping it helps you:
1. Find your Spark version; my Spark version was 2.4.3.
2. Find your pyspark version; my pyspark version was 2.2.0.
3. Reinstall pyspark so it matches your Spark version:
pip install pyspark==2.4.3
After that, everything worked. Hope this helps.
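To compare the two versions quickly, a minimal sketch (assuming pyspark is importable and spark-submit is on your PATH):
# Print the version of the pyspark package Python actually imports;
# it should match what your Spark install reports, e.g. via
# `spark-submit --version`.
import pyspark
print(pyspark.__version__)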
Upvotes: 4
Reputation: 8817
This can happen when you are running Spark 2.3.1 jars with an older version of pyspark (e.g., 2.3.0).
Upvotes: 0
Reputation: 1501
I am using PySpark 2.3.1 with PyCharm 2018.1.4 and facing a similar issue on my Windows machine. When I run this Python file using spark-submit, it executes successfully.
I followed the steps below.
Added the following paths to my PyCharm project structure:
C:\apache-spark
and
C:\apache-spark\python\lib\py4j-0.10.7-src.zip
Created a new Python file (Demo1.py) and pasted the content below into it.
from pyspark import SparkContext

# Build a local SparkContext and run a simple word count over the README.
sc = SparkContext(master="local", appName="Spark Demo")
rdd = sc.textFile("C:/apache-spark/README.md")
wordsRDD = rdd.flatMap(lambda words: words.split(" "))  # split lines into words
wordsRDD = wordsRDD.map(lambda word: (word, 1))         # pair each word with 1
wordsCount = wordsRDD.reduceByKey(lambda x, y: x + y)   # sum counts per word
print(wordsCount.collect())
Running this Python file in PyCharm gives the error below:
Exception in thread "main" java.util.NoSuchElementException: key not found: _PYSPARK_DRIVER_CONN_INFO_PATH
Whereas the same program, when executed from the command prompt, yields the correct result:
C:\Users\manish>spark-submit C:\Demo\demo1.py
Any suggestions to solve this problem?
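As an aside, instead of configuring these paths by hand in PyCharm, the findspark package can do the equivalent setup at runtime. A minimal sketch, assuming findspark has been pip-installed and the install path is as above:
import findspark

# Point findspark at the Spark install so it adds pyspark and py4j
# to sys.path before pyspark is imported.
findspark.init("C:/apache-spark")

from pyspark import SparkContext
sc = SparkContext(master="local", appName="Spark Demo")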
Upvotes: 2
Reputation: 8523
I had this problem too, and it turned out that the pyspark code I was importing/running from PyCharm was still from the Spark 2.2 install instead of the Spark 2.3 installation that I had updated SPARK_HOME to point to.
Specifically, I had added spark-2.2 to my PyCharm project structure and then marked its python folder as "Sources" so PyCharm would recognize all its symbols. So the PyCharm code was importing from there instead of spark-2.3, and the older code didn't set the _PYSPARK_DRIVER_CONN_INFO_PATH environment variable.
If Vezir's answer didn't fix your case, try tracing into the creation of SparkContext and carefully compare the path pyspark is being imported from against the path of your Spark install. Similarly, if you installed pyspark into your Python project via pip, make sure you installed 2.3.1 to match your installed Spark version.
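A quick way to check for this kind of mismatch is a sketch like the following (it only assumes pyspark is importable and SPARK_HOME is set):
import os
import pyspark

# The location Python actually imports pyspark from ...
print("imported from:", pyspark.__file__)
print("version:", pyspark.__version__)
# ... should live under the install SPARK_HOME points to.
print("SPARK_HOME:", os.environ.get("SPARK_HOME"))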
Upvotes: 0
Reputation: 101
I have had a similar exception. My problem was running Jupyter and Spark as different users. When I ran them as the same user, the problem was solved.
Details;
When I updated Spark from v2.2.0 to v2.3.1 and then ran the Jupyter notebook, the error log was as follows:
Exception in thread "main" java.util.NoSuchElementException: key not found: _PYSPARK_DRIVER_CONN_INFO_PATH
at scala.collection.MapLike$class.default(MapLike.scala:228)
at scala.collection.AbstractMap.default(Map.scala:59)
at scala.collection.MapLike$class.apply(MapLike.scala:141)
at scala.collection.AbstractMap.apply(Map.scala:59)
at org.apache.spark.api.python.PythonGatewayServer$.main(PythonGatewayServer.scala:64)
at org.apache.spark.api.python.PythonGatewayServer.main(PythonGatewayServer.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
While googling, I encountered the following link: spark-commits mailing list archives. In the file
/core/src/main/scala/org/apache/spark/api/python/PythonGatewayServer.scala
there is a change:
+ // Communicate the connection information back to the python process by writing the
+ // information in the requested file. This needs to match the read side in java_gateway.py.
+ val connectionInfoPath = new File(sys.env("_PYSPARK_DRIVER_CONN_INFO_PATH"))
+ val tmpPath = Files.createTempFile(connectionInfoPath.getParentFile().toPath(),
+ "connection", ".info").toFile()
According to this change, a temp file is created in the connection-info directory. My problem was running Jupyter and Spark as different users; because of this, I think the process could not create the temp file. When I ran them as the same user, the problem was solved. I hope it helps.
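If you suspect the same cause, a small sanity check is to confirm which user the notebook runs as and that it can create files in the temp directory. A sketch using only the standard library:
import getpass
import tempfile

# The user the Python/Jupyter process is actually running as.
print("running as:", getpass.getuser())

# The connection info is written to a temp file (see the commit quoted
# above); this fails early if the temp directory is not writable.
with tempfile.NamedTemporaryFile(prefix="connection", suffix=".info") as f:
    print("temp file created OK:", f.name)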
Upvotes: 0