Reputation: 203
Hello, I was working with PySpark, implementing a sentiment analysis project with the ML package for the first time. The code was working fine, but it suddenly started showing the error below:
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:50532)
Traceback (most recent call last):
File "C:\opt\spark\spark-2.3.0-bin-hadoop2.7\python\lib\py4j-0.10.6-src.zip\py4j\java_gateway.py", line 852, in _get_connection
connection = self.deque.pop()
IndexError: pop from an empty deque
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\opt\spark\spark-2.3.0-bin-hadoop2.7\python\lib\py4j-0.10.6-src.zip\py4j\java_gateway.py", line 990, in start
self.socket.connect((self.address, self.port))
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it
Can someone please help? The full error description is shown above.
Upvotes: 15
Views: 40926
Reputation: 36
Maybe the Spark UI port is already occupied, or there are other errors earlier in the log that cause this one.
Maybe this can help you: https://stackoverflow.com/questions/32820087/spark-multiple-spark-submit-in-parallel
spark-submit --conf spark.ui.port=5051
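If you build the SparkSession from Python instead of using spark-submit, the same option can be passed through the builder. A minimal sketch; the app name and port number are only placeholders:
from pyspark.sql import SparkSession

# Move the Spark UI off the default port 4040 in case something else holds it.
spark = SparkSession.builder \
    .appName('my_app') \
    .config('spark.ui.port', '5051') \
    .getOrCreate()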
Upvotes: 0
Reputation: 1306
If you are using a Jupyter notebook, just restart it. If not, restart PySpark; that should solve the problem. It usually happens because you are using too many collect() calls or have some other memory-related issue.
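If restarting the whole kernel is inconvenient, stopping the existing session and creating a fresh one gives a similar effect. A rough sketch, assuming a spark session object already exists in the notebook:
from pyspark.sql import SparkSession

# Tear down the old session (and its JVM-side state), then start a clean one.
spark.stop()
spark = SparkSession.builder.appName('my_app').getOrCreate()

# Prefer bounded actions such as df.take(20) or df.show() over df.collect()
# to avoid pulling an entire DataFrame into driver memory.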
Upvotes: 13
Reputation: 700
Add more resources to Spark. For example, if you're working in local mode, a configuration like the following should be sufficient:
from pyspark.sql import SparkSession

# Allocate plenty of driver memory; adjust '32G' to what your machine actually has.
spark = SparkSession.builder \
    .appName('app_name') \
    .master('local[*]') \
    .config('spark.sql.execution.arrow.pyspark.enabled', True) \
    .config('spark.sql.session.timeZone', 'UTC') \
    .config('spark.driver.memory', '32G') \
    .config('spark.ui.showConsoleProgress', True) \
    .config('spark.sql.repl.eagerEval.enabled', True) \
    .getOrCreate()
Upvotes: 9
Reputation: 45
I encountered the same problem while working on Colab. I terminated the current session and reconnected, and it worked for me.
Upvotes: 0
Reputation: 476
I encountered this error while trying to use PySpark within a Docker container. In my case, the error came from assigning more resources to Spark than Docker had access to.
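If that is the cause, keeping Spark's memory below the container limit avoids the crash. A rough sketch, assuming the container is capped at about 4 GB:
from pyspark.sql import SparkSession

# Keep the driver well under the Docker memory limit (~4 GB assumed here),
# leaving headroom for the Python worker processes and the OS.
spark = SparkSession.builder \
    .appName('my_app') \
    .master('local[*]') \
    .config('spark.driver.memory', '3g') \
    .getOrCreate()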
Upvotes: 7