user3542930

Reputation: 557

Writing parquet data into S3 using saveAsTable does not complete

Using Spark 2.0.2 on EC2 machines, I have been trying to write tables into S3 in parquet format with partitions, but the application never seems to finish. I can see that Spark has written files into the S3 bucket/folder under _temporary, and that once the saveAsTable job finishes, the application hangs.
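For reference, this is a minimal sketch of the kind of write I am doing (the table name, partition column, and bucket path here are placeholders, not my real ones):

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

object WriteParquetToS3 {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("saveAsTable-to-s3")
      .enableHiveSupport() // saveAsTable registers the table in the Hive metastore
      .getOrCreate()
    import spark.implicits._

    // Toy DataFrame standing in for the real data.
    val df = Seq((1, "2017-01-01"), (2, "2017-01-02")).toDF("id", "event_date")

    df.write
      .mode(SaveMode.Overwrite)
      .format("parquet")
      .partitionBy("event_date")
      .option("path", "s3a://my-bucket/warehouse/events") // placeholder bucket/path
      .saveAsTable("events")                              // placeholder table name

    spark.stop()
  }
}
```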

Taking a look at S3 shows that the partitions were generated with data inside them (spot-checked), but the _temporary folder is still there, and show tables does not include the new table.

Is anyone else experiencing this or has a solution?

Does anyone know what goes on underneath the saveAsTable command?

Upvotes: 1

Views: 874

Answers (1)

stevel

Reputation: 13480

It's not hanging; it's copying the data from the temporary store to the destination, which takes time on the order of data size / (10 MB/s). Spark is calling Hadoop's FileOutputCommitter to do this, and that committer assumes it is talking to a filesystem where rename() is an instantaneous transaction. On S3 it is not: a rename is a copy followed by a delete, so the final commit pass takes time proportional to the amount of data written.
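This is not part of the answer above, but assuming Hadoop 2.7+ is on the classpath, one commonly suggested mitigation is to switch the FileOutputCommitter to algorithm version 2, which renames each task's output as that task commits instead of in one serial pass at job commit:

```scala
import org.apache.spark.sql.SparkSession

// Configuration sketch: reduce the serial job-commit rename cost on S3.
// mapreduce.fileoutputcommitter.algorithm.version is a standard Hadoop
// MapReduce property, passed through via Spark's "spark.hadoop." prefix.
val spark = SparkSession.builder()
  .appName("s3-committer-tuning")
  .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
  .getOrCreate()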

Upvotes: 1
