Alex M

Reputation: 61

Problem with saving Spark DataFrame as Parquet

I'm trying to save a DataFrame to a path as Parquet files. The issue is that display() shows many results in "Prop_0", but whenever I try to save them, only the first one is converted and written to the path.

The code I'm using is:

dbutils.fs.rm(Path_1, True)
avroFile = spark.read.format('com.databricks.spark.avro').load(Path_1)
avroFile.write.mode("overwrite").save(Path_2, format="parquet") 

Upvotes: 0

Views: 1704

Answers (1)

CHEEKATLAPRADEEP

Reputation: 12768

This is expected behaviour. Spark writes its output through the Hadoop file output format, which writes each partition of the data as a separate file. That is why the output path contains part- files.
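To check this on your side, here is a minimal sketch that compares the DataFrame's partition count with the part- files found in the output directory. The helper names list_part_files and describe_output are hypothetical, and the sketch assumes output_dir is visible on the local filesystem (on Databricks that would typically be the /dbfs/ form of the path, which is an assumption about your setup).

```python
import os


def list_part_files(output_dir):
    """Return the sorted names of the part- files in a local output directory."""
    # Marker files such as _SUCCESS are skipped; only data files start with "part-".
    return sorted(
        name for name in os.listdir(output_dir)
        if name.startswith("part-")
    )


def describe_output(df, output_dir):
    """Compare the DataFrame's partition count with the part- files on disk."""
    return {
        "partitions": df.rdd.getNumPartitions(),
        "part_files": len(list_part_files(output_dir)),
    }
```

Several part- files under the output path together hold the full DataFrame. Reading the directory path itself, rather than a single part- file, returns all rows.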

I'm able to run the above code without any issue.


You can use the method below to save a Spark DataFrame as Parquet files.

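As a rough sketch of one common way to do this (not necessarily the exact method from the screenshot), the DataFrame writer's parquet() call can be wrapped as below. The function name save_as_parquet and its single_file flag are hypothetical; coalesce(1) is only worth using when the data is small enough to fit in one partition.

```python
def save_as_parquet(df, path, single_file=False):
    """Write df to path as Parquet, overwriting any existing output."""
    # coalesce(1) merges all partitions into one, so Spark writes a single
    # part- file instead of one file per partition.
    out = df.coalesce(1) if single_file else df
    out.write.mode("overwrite").parquet(path)
```

With the names from the question, this would be called as save_as_parquet(avroFile, Path_2). The result is still a directory at Path_2 containing part- files, and spark.read.parquet(Path_2) reads them back as one DataFrame.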

Upvotes: 1
