Pralay Ghosh

Reputation: 15

How to format Scala's output from JSON to text file format

I am using Scala with Spark, with the versions below.

Scala - 2.10.4 Spark - 1.2.0

My situation is described below.

I have an RDD (say, JoinOp) of nested tuples that contain case class instances, for example:

(123,(null,employeeDetails(Smith,NY,DW))) 
(456,(null,employeeDetails(John,IN,CS)))

This RDD is created by joining two files.

My requirement is to convert this JSON-like format to plain text file format, without any "null" values and without the case class name (here 'employeeDetails').

My desired output is:

123,Smith,NY,DW
456,John,IN,CS

I have tried string interpolation for this, but with only partial success.

val textOp = JoinOp.map{jm => s"${jm._1},${jm._2._2}"}

If I print textOp, I get the output below.

123,employeeDetails(Smith,NY,DW)
456,employeeDetails(John,IN,CS)

If I try to access the nested elements of the "employeeDetails" case class with string interpolation, it throws an error like the one below.

JoinOp.map{jm => s"${jm._1},${jm._2._2._1}"}.foreach(println)

<console> :23: Error : value _1 is not member of jm

I understand from this that the syntax above cannot access the nested elements of the "employeeDetails" case class.

What might be the solution to this issue? Any help or pointers would be much appreciated.

Many Thanks, Pralay

Upvotes: 0

Views: 535

Answers (3)

Shyamendra Solanki

Reputation: 8851

If you just need to print all fields of the case class, you can use productIterator to traverse the field list.

val textOp = JoinOp.map { jm => 
    s"""${jm._1},${jm._2._2.productIterator.mkString(",")}"""
}
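
As a minimal sketch of the same idea outside Spark: the snippet below uses a plain List in place of the RDD so it runs in a REPL or script, and it assumes a three-field case class. The name EmployeeDetails and its field names are hypothetical, since the question does not show the real definition.

```scala
// Hypothetical stand-in for the asker's case class; field names are assumed.
case class EmployeeDetails(name: String, state: String, dept: String)

// A plain List stands in for the RDD so the sketch runs without Spark.
val joinOp: List[(Int, (String, EmployeeDetails))] = List(
  (123, (null, EmployeeDetails("Smith", "NY", "DW"))),
  (456, (null, EmployeeDetails("John", "IN", "CS")))
)

// productIterator yields the constructor fields in declaration order;
// the null on the left side of the join is skipped by the pattern.
val textOp: List[String] = joinOp.map { case (id, (_, details)) =>
  s"$id,${details.productIterator.mkString(",")}"
}

textOp.foreach(println)
```

Because productIterator walks every field, this keeps working if fields are added to the case class, at the cost of not being able to pick or reorder individual fields.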

Upvotes: 1

abalcerek

Reputation: 1819

You can do it like this:

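// Placeholder definition; substitute the real case class and field names.
case class EmployeeDetails(var0: String, var1: String, var2: String)
val data = List((123,(null, EmployeeDetails("Smith", "NY", "DW"))))

// Destructure the nested tuple and the case class in a single pattern.
data.map { case (num, (_, EmployeeDetails(var0, var1, var2))) =>
  s"$num,$var0,$var1,$var2"
}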

Upvotes: 0

Iulian Dragos

Reputation: 5712

Case classes have field names. So, instead of ._1 you need to use the field name for that position. Assuming the following definition:

case class EmployeeDetails(name: String, state: String)

you would access it like this:

JoinOp.map{jm => s"${jm._1},${jm._2._2.name}"}.foreach(println)
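
To get the full line the question asks for, the same approach can be extended to every field. This is only a sketch: it uses a plain List instead of the RDD, and the three field names (name, state, dept) are assumptions, since the question does not show the actual case class definition.

```scala
// Hypothetical definition; the real field names are not shown in the question.
case class EmployeeDetails(name: String, state: String, dept: String)

// A plain List stands in for the RDD so the sketch runs without Spark.
val joinOp = List(
  (123, (null: String, EmployeeDetails("Smith", "NY", "DW"))),
  (456, (null: String, EmployeeDetails("John", "IN", "CS")))
)

// Tuple positions use ._1 / ._2; case class fields are accessed by name.
val lines = joinOp.map { jm =>
  val e = jm._2._2
  s"${jm._1},${e.name},${e.state},${e.dept}"
}

lines.foreach(println)
```

Accessing fields by name makes it easy to choose which fields appear and in what order, unlike a generic traversal of all fields.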

Upvotes: 1
