Reputation: 2423
I need to save the output of df.show() as a string so that I can email it directly.
For example, take the snippet below from the official Spark docs:
val df = spark.read.json("examples/src/main/resources/people.json")
// Displays the content of the DataFrame to stdout
df.show()
// +----+-------+
// | age|   name|
// +----+-------+
// |null|Michael|
// |  30|   Andy|
// |  19| Justin|
// +----+-------+
I need to save the table above, as printed to the console, as a string. I looked at log4j for logging it, but couldn't find any information on capturing only this output.
Can someone help me with it?
Upvotes: 16
Views: 6306
Reputation: 18424
scala.Console has a withOut method for this kind of thing:
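import java.io.ByteArrayOutputStream

val outCapture = new ByteArrayOutputStream
// Everything printed via scala.Console inside this block goes to outCapture
Console.withOut(outCapture) {
  df.show()
}
val result = new String(outCapture.toByteArray)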
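If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: the names CaptureDemo and captureOutput are made up for illustration, and it fixes the charset to UTF-8 on both sides, which the snippet above does not do.

```scala
import java.io.{ByteArrayOutputStream, PrintStream}

object CaptureDemo {
  // Hypothetical helper (not part of Spark or the Scala library):
  // runs `body` and returns everything it printed through scala.Console.
  def captureOutput(body: => Unit): String = {
    val buffer = new ByteArrayOutputStream()
    // Same charset for writing and reading, so the result does not
    // depend on the platform default encoding
    val stream = new PrintStream(buffer, true, "UTF-8")
    Console.withOut(stream) {
      body
    }
    stream.flush()
    buffer.toString("UTF-8")
  }
}
```

With a DataFrame in scope this would be used as val table = CaptureDemo.captureOutput { df.show() }. Keep in mind that Console.withOut only redirects output written through scala.Console; anything written straight to System.out is not captured.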
Upvotes: 26
Reputation: 16076
A workaround is to redirect standard output to a variable:
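val baos = new java.io.ByteArrayOutputStream()
val ps = new java.io.PrintStream(baos)
val oldPs = Console.out
Console.setOut(ps)
df.show()
ps.flush() // make sure everything has reached baos before reading it
val content = baos.toString()
Console.setOut(oldPs) // restore the original output stream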
Note that this produces one deprecation warning: Console.setOut is deprecated in favor of Console.withOut.
You can also re-implement the method Dataset.showString, which generates this output. It uses take in the background. Maybe it's also a good moment to create a PR to make showString public? :)
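To give an idea of what re-implementing it involves, here is a rough sketch of a table formatter. It is not Spark's actual showString: the names ShowStringSketch and formatTable are made up, and it leaves out truncation of long cells and the other options the real method has. It only right-aligns each cell to the widest value in its column.

```scala
object ShowStringSketch {
  // Hypothetical helper, not Spark's real showString: renders a header and
  // rows as a right-aligned ASCII table similar to what df.show() prints.
  // Assumes every row has as many cells as the header.
  def formatTable(header: Seq[String], rows: Seq[Seq[String]]): String = {
    val all = header +: rows
    // Width of each column is the longest cell found in that column
    val widths = header.indices.map(i => all.map(_(i).length).max)
    val sep = widths.map(w => "-" * w).mkString("+", "+", "+")
    // Left-pad every cell with spaces up to its column width
    def line(cells: Seq[String]): String =
      cells.zip(widths)
        .map { case (c, w) => (" " * (w - c.length)) + c }
        .mkString("|", "|", "|")
    ((Seq(sep, line(header), sep) ++ rows.map(line)) :+ sep).mkString("\n")
  }
}
```

With Spark you would presumably feed it df.columns as the header and the rows from df.take(n), with each value converted to a string (and null rendered as "null").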
Upvotes: 6