AlexModestov

Reputation: 381

apache spark "Py4JError: Answer from Java side is empty"

I get this error every time I run my job. I am using Sparkling Water. My conf file contains:

***"spark.driver.memory 65g
spark.python.worker.memory 65g
spark.master local[*]"***

The amount of data is about 5 GB. There is no other information about the error. Does anybody know why it happens? Thank you!

***"ERROR:py4j.java_gateway:Error while sending or receiving.
Traceback (most recent call last):
  File "/data/analytics/Spark1.6.1/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 746, in send_command
    raise Py4JError("Answer from Java side is empty")
Py4JError: Answer from Java side is empty
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server
Traceback (most recent call last):
  File "/data/analytics/Spark1.6.1/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 690, in start
    self.socket.connect((self.address, self.port))
  File "/usr/local/anaconda/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server
Traceback (most recent call last):
  File "/data/analytics/Spark1.6.1/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 690, in start
    self.socket.connect((self.address, self.port))
  File "/usr/local/anaconda/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server
Traceback (most recent call last):
  File "/data/analytics/Spark1.6.1/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 690, in start
    self.socket.connect((self.address, self.port))
  File "/usr/local/anaconda/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock,name)(*args)
error: [Errno 111] Connection refused

Upvotes: 22

Views: 48924

Answers (3)

jake wong

Reputation: 5228

Another point to note if you are on WSL 2 using PySpark: ensure that your WSL 2 config file allows the VM enough memory.

# Settings apply across all Linux distros running on WSL 2
[wsl2]

# Limits the amount of memory the VM can use; set as a whole number of GB or MB
memory=12GB  # Originally 3GB, which caused failures: spark.executor.memory and spark.driver.memory could never exceed 3GB no matter how high I set them

# Sets the VM to use eight virtual processors
processors=8

For reference, your .wslconfig file should be located in C:\Users\USERNAME.
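
If you want to confirm that the new limit actually took effect inside the distro before sizing spark.driver.memory and spark.executor.memory, here is a minimal sketch using standard Linux sysconf values (nothing WSL-specific):

# Minimal sketch: print how much memory the WSL 2 VM actually exposes,
# so Spark memory settings can be sized to fit under it.
import os

page_size = os.sysconf("SC_PAGE_SIZE")    # bytes per memory page
phys_pages = os.sysconf("SC_PHYS_PAGES")  # total physical pages visible to the VM
total_gb = page_size * phys_pages / (1024.0 ** 3)
print("Memory visible inside WSL 2: %.1f GB" % total_gb)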

Upvotes: 0

Serhii Sokolenko

Reputation: 6164

Usually, you'll see this error when the Java process gets silently killed by the OOM Killer.

The OOM Killer (Out of Memory Killer) is a Linux process that kicks in when the system becomes critically low on memory. It selects a process based on its "badness" score and kills it to reclaim memory. Read more on OOM Killer here.

Increasing spark.executor.memory and/or spark.driver.memory values will only make things worse in this case, i.e. you may want to do the opposite!

Other options would be to:

  • increase the number of partitions if you're working with very big data sources;
  • increase the number of worker nodes;
  • add more physical memory to worker/driver nodes;

Or, if you're running your driver/workers using docker:

  • increase docker memory limit;
  • set --oom-kill-disable on your containers, but make sure you understand possible consequences!

Read more on --oom-kill-disable and other docker memory settings here.
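
For the partitioning option above, a minimal PySpark sketch (assuming a Spark 2.x+ DataFrame job; the paths and partition count are illustrative, not from the question):

# Illustrative sketch: more (and therefore smaller) partitions mean each task
# holds less data in memory at once, lowering the chance the OOM Killer steps in.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("oom-friendly-job").getOrCreate()

df = spark.read.parquet("/path/to/big_dataset")        # hypothetical input
df = df.repartition(400)                               # tune to your data size
df.write.mode("overwrite").parquet("/path/to/output")  # hypothetical output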

Upvotes: 11

Nigel Ng

Reputation: 583

Have you tried setting spark.executor.memory and spark.driver.memory in your Spark configuration file?

See https://stackoverflow.com/a/22742982/5453184 for more info.
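
Note that spark.driver.memory has to be set before the driver JVM starts, so changing it on an already-running SparkContext has no effect; set it in conf/spark-defaults.conf, on spark-submit, or, for a plain Python script, with something like this sketch (the 8g values are illustrative):

# Illustrative sketch: pass driver/executor memory to the JVM before it is
# launched. The 8g values are assumptions; tune them to your machine.
import os
os.environ["PYSPARK_SUBMIT_ARGS"] = (
    "--driver-memory 8g --executor-memory 8g pyspark-shell"
)

from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("memory-example").setMaster("local[*]")
sc = SparkContext(conf=conf)
print(sc.getConf().get("spark.driver.memory", "not set"))  # should report 8g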

Upvotes: 7
