Carbon

Reputation: 3943

Is there a way to set spark csv number format?

If I'm using myDF.write.csv("wherever"), how can I set the numeric format for the stored data? E.g., if I do:

val t = spark.sql("SELECT cast(1000000000000 as double) as aNum")
t.write.csv("WXYZ")

and then review WXYZ, I will find I have 1.0E12. How could I change this for all doubles so that I get 1000000000000.00?

Upvotes: 1

Views: 1120

Answers (2)

skywalkerytx

Reputation: 244

If the data comes from Hive, there is a Hive UDF, printf, that you can use:

select printf('%.2f', col) from foobar

Plan B:

dataset.map(col => f"$col%.2f")

Be careful with Plan B: depending on your data source, the map may add extra cost.

By the way, sometimes it is just a display issue in Excel; check the CSV in a text editor first.
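As an aside, the `%.2f` pattern used by both the Hive printf UDF and the Scala f-interpolator above is ordinary printf-style formatting, which expands scientific notation into a fixed-point string with two decimal places. A quick illustration in plain Python (no Spark needed) of what that pattern does to the value from the question:

```python
# The question's double: Spark displays it as 1.0E12 in the CSV.
value = 1000000000000.0

# printf-style "%.2f" renders it as fixed-point with two decimals,
# which is the format the answers above produce per value.
formatted = "%.2f" % value
print(formatted)  # 1000000000000.00
```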

Upvotes: 0

Steven Black

Reputation: 2232

The way I've handled this issue is by casting the number to a string:

val t = spark.sql("SELECT cast(1000000000000 as string) as aNum")
t.write.csv("WXYZ")
t.show()

And the output is

+-------------+
|         aNum|
+-------------+
|1000000000000|
+-------------+

:) I hope this helps!

Upvotes: 1
