user1624577

Reputation: 577

Spark dataframe will not show() - Py4JJavaError: An error occurred while calling o426.showString

I have a dataframe that I cannot .show(). Every time I call it, I get the error below. Is it possible that there is a corrupted column?

Error:

Py4JJavaError: An error occurred while calling o426.showString. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 381.0 failed 4 times, most recent failure: Lost task 0.3 in stage 381.0 (TID 19204, ddlps28.rsc.dwo.com, executor 99): org.apache.spark.api.python.PythonException: Traceback (most recent call last): File "/opt/cloudera/parcels/SPARK2-2.2.0.cloudera1-1.cdh5.12.0.p0.142354/lib/spark2/python/pyspark/worker.py", line 177, in main

Upvotes: 2

Views: 5841

Answers (1)

benlaird

Reputation: 879

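Your error most likely isn't in the "show" operation itself. Spark evaluates lazily, so .show() is simply the action that triggers execution of your DAG, and a failure in any earlier step surfaces there. You said it works if you don't run your UDF, so you probably have an error inside that UDF. The PythonException in your stack trace points the same way: it is raised by the Python worker on the executor, not by the driver. The full Python traceback should be in the executor logs on the worker nodes, so go through your Hadoop UI to reach those logs and see what is really breaking.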
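If getting at the executor logs is awkward, one way to narrow things down is to run the UDF body as an ordinary Python function on a handful of sample values before handing it to Spark. The sketch below is only an illustration: `parse_amount` is a made-up stand-in for your UDF (the question doesn't show the real one), and `with_error_capture` and `sample_rows` are hypothetical names. It needs no Spark installation.

```python
def parse_amount(value):
    # Hypothetical UDF body: breaks on None and on non-numeric strings.
    return float(value.strip())


def with_error_capture(func):
    """Wrap func so it returns (result, error) instead of raising."""
    def wrapper(*args):
        try:
            return func(*args), None
        except Exception as exc:
            # Keep the exception type and message so the bad input is easy to spot.
            return None, f"{type(exc).__name__}: {exc}"
    return wrapper


safe_parse = with_error_capture(parse_amount)

# Made-up sample values, including the kinds of input that often break UDFs.
sample_rows = ["12.5", " 3 ", None, "n/a"]
results = [safe_parse(v) for v in sample_rows]

# Collect only the inputs that failed, together with the error they produced.
failures = [
    (value, err)
    for value, (_, err) in zip(sample_rows, results)
    if err is not None
]
for value, err in failures:
    print(repr(value), "->", err)
```

Nulls and unexpected types are common culprits, so it is worth including them in the sample. The same wrapper could in principle be applied before registering the function as a Spark UDF, but the declared return type would then have to be a struct with a result field and an error field; treat that part as an assumption to check against your Spark version.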

Upvotes: 3
