Spark DataFrame order preservation: does calling save on an orderBy DataFrame preserve the ordering?

Question

I ran some test cases from a Spark shell. The statements I executed were of the form:

read.orderBy($"p_int".asc).write.format("com.databricks.spark.csv").save("file:///tmp/output.txt")

The content in the output directory always seems to be sorted. However, I cannot find any Spark documentation that describes guarantees from the DataFrameWriter about preserving either partition order or row order.
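
For anyone who wants to reproduce the observation, here is a minimal sketch of one way such a check could be written. It is not the exact test I ran: it assumes Spark 2.x or later (SparkSession and the built-in csv writer instead of com.databricks.spark.csv), a local master, a synthetic single-column input, and a hypothetical output path. A result of true here is only an observation on one run, not evidence of a documented guarantee.

```scala
import java.io.File
import scala.io.Source
import org.apache.spark.sql.SparkSession

object OrderCheck {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode with 4 threads, Spark 2.x+ API
    val spark = SparkSession.builder()
      .master("local[4]")
      .appName("order-check")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input: shuffled integers in a column named p_int
    val df = scala.util.Random.shuffle((1 to 100000).toList).toDF("p_int")

    // Hypothetical output directory on the local file system
    val out = "/tmp/order-check-output"
    df.orderBy($"p_int".asc).write.mode("overwrite").csv(s"file://$out")

    // Read the part files back in file-name order and concatenate their rows
    val values = new File(out).listFiles()
      .filter(_.getName.startsWith("part-"))
      .sortBy(_.getName)
      .flatMap { f =>
        val src = Source.fromFile(f)
        try src.getLines().map(_.toInt).toList finally src.close()
      }

    // Compare the concatenated rows with a sorted copy
    println(s"globally sorted: ${values.sameElements(values.sorted)}")
    spark.stop()
  }
}
```

The check reads the part files directly instead of loading them back through Spark, so that the read path cannot reorder rows and hide a difference.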

The question is: can I always expect the data in the target file to be sorted? Please include a link to the relevant documentation.


Answers (1)
