Reputation: 61
I am facing an issue where my Spark application fails approximately once every 50 days, yet I see no errors in the application logs. The only clue I have found is in the NodeManager logs, which show the following warning:
WARN org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Exception from container-launch with container ID: container_e225_1708884103504_1826568_02_000002 and exit code: 1
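For reference, the full log of that container can be pulled from YARN log aggregation; a minimal sketch of what I mean (the applicationId below is derived from the container ID in the warning, and log aggregation is assumed to be enabled on the cluster):

```python
import subprocess

# The failing container from the NodeManager warning; the applicationId is
# embedded in the container ID (cluster timestamp 1708884103504,
# application sequence 1826568).
container_id = "container_e225_1708884103504_1826568_02_000002"
app_id = "application_1708884103504_1826568"

# With log aggregation enabled, this prints the container's own stdout/stderr,
# which is where the real error usually lands when container-launch reports
# exit code 1 but the application logs look clean.
subprocess.run(
    ["yarn", "logs", "-applicationId", app_id, "-containerId", container_id],
    check=True,
)
```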
After the restart, I checked memory usage on both the executors and the driver. In the Spark UI, the driver's memory usage looks unusual: it shows 98.1 GB used out of a 19.1 GB total.
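For context, these are the standard driver-memory settings on YARN that I understand to be in play; a sketch with placeholder values, not my actual configuration:

```python
# Hypothetical spark-submit configuration, expressed as the --conf pairs that
# govern driver memory accounting on YARN (all values are placeholders):
driver_confs = {
    # heap of the driver JVM
    "spark.driver.memory": "16g",
    # off-heap/native headroom that YARN also charges against the container;
    # exceeding it can get the container killed with nothing in the app logs
    "spark.driver.memoryOverhead": "3g",
    # leave a heap dump behind if the driver ever OOMs, so a once-in-50-days
    # failure still leaves evidence
    "spark.driver.extraJavaOptions": "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp",
}

print(" ".join(f"--conf {k}={v}" for k, v in driver_confs.items()))
```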
My Questions:

1. What could cause the container to exit with code 1 roughly every 50 days when the application logs show no errors?
2. Why does the Spark UI report the driver's memory usage as 98.1 GB used against a 19.1 GB total, i.e. far above the configured limit?
Any insights or suggestions would be greatly appreciated!
Upvotes: 0
Views: 23