Rajashree Gr

Reputation: 589

Spark saveAsTextFile with file extension

I want to partition my results and save them as CSV files in a specified location. However, I couldn't find any option to specify the file format with the code below. All the files are created with names like part-000**. How can I specify the required file format here?

records.repartition(partitionNum).saveAsTextFile(path)

Upvotes: 0

Views: 630

Answers (1)

Junhua.xie

Reputation: 174

You can try this:

df.coalesce(1).write.option("header",true).csv(path)

Here `path` is a folder, and with the default save mode it must not already exist. Spark cannot write directly to a specific CSV file name, but you can rename the generated part file afterwards with the Hadoop API (bundled with Spark).

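import org.apache.hadoop.fs.{FileSystem, Path}

// hdfsFolder and fileName are the target folder and file name; define them yourself
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
// coalesce(1) produces a single part file; take its name
val file = fs.globStatus(new Path(s"$path/part*"))(0).getPath.getName
// rename returns false if the move did not succeed
val result: Boolean = fs.rename(new Path(s"$path/$file"), new Path(s"$hdfsFolder/$fileName"))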
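
If it helps, the two steps above can be wrapped into one helper. This is only a rough sketch: `SingleCsvWriter`, `writeSingleCsv`, `tmpDir` and `targetFile` are names I made up for illustration, and it assumes that `tmpDir` does not exist yet and that the parent folder of `targetFile` already exists.

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.{DataFrame, SparkSession}

object SingleCsvWriter {
  // Hypothetical helper: writes df as a single CSV part file into tmpDir,
  // moves that part file to targetFile, then deletes tmpDir.
  // Returns true only if the rename succeeded.
  def writeSingleCsv(spark: SparkSession, df: DataFrame,
                     tmpDir: String, targetFile: String): Boolean = {
    // Fails if tmpDir already exists (default save mode)
    df.coalesce(1).write.option("header", "true").csv(tmpDir)

    // Resolve the file system from the path, so this works for hdfs:// and local paths
    val fs = new Path(tmpDir).getFileSystem(spark.sparkContext.hadoopConfiguration)
    val parts = fs.globStatus(new Path(s"$tmpDir/part-*"))

    if (parts == null || parts.isEmpty) {
      false // no part file was found, nothing to rename
    } else {
      val renamed = fs.rename(parts(0).getPath, new Path(targetFile))
      if (renamed) fs.delete(new Path(tmpDir), true) // remove _SUCCESS and the empty folder
      renamed
    }
  }
}
```

You would call it like `SingleCsvWriter.writeSingleCsv(spark, df, "/tmp/out_tmp", "/data/result.csv")` (both paths are placeholders). Keep in mind that `coalesce(1)` pulls all the data into one task, so this is only reasonable for results that fit comfortably on a single executor.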

Upvotes: 1
