GenericDisplayName

Reputation: 463

How to loop over an array and concatenate its elements into one print statement or variable in Spark Scala

I am trying to figure out how to concatenate the elements of my array, as I loop over them, into a single variable or print statement. I need them printed to stdout in a specific format so that another application (an Oozie job) can use them.

Here is the relevant part of what I have so far:

filterDF.registerTempTable("filterDF_table")

val filterDF_table_print = spark.sql("""
SELECT SUBSTRING(constraint,locate('(',constraint) + 1,locate(',',constraint) -locate('(',constraint) -1) as error_column,
       SUBSTRING(constraint,1 ,locate('(',constraint) -1) as error_reason
FROM filterDF_table
""")

filterDF_table_print.rdd.map(row => {
  val row1 = row.getAs[String]("error_reason")
  val make = if (row1.toLowerCase == "patternmatchconstraint") "Invalid Length" else "error_reason"
  ("field", row(0), make)
}).collect().foreach(println)

This works so far, and it took me a while to get here. The output contains all of the elements I need in my printed statement, just not in the format I am hoping for:

(field,FOO1,Invalid Length)
(field,FOO2,Invalid Length)
(field,FOO3,Invalid Length)
(field,FOO4,Invalid Length)
(field,FOO5,Invalid Length)
(field,FOO6,Invalid Length)
(field,FOO7,Invalid Length)

What I need for my next application to run properly is something like this.

OUTVAR:field,FOO1,Invalid Length
       field,FOO2,Invalid Length
       field,FOO3,Invalid Length  
       field,FOO4,Invalid Length
       field,FOO5,Invalid Length
       field,FOO6,Invalid Length
       field,FOO7,Invalid Length

I am not too worried about the formatting and spacing at this point; I can google around for that or ask another question if need be. Mainly, I need to get all of this into one printed statement to move forward.

Upvotes: 0

Views: 564

Answers (1)

Allen Han

Reputation: 1163

Here is my suggested solution. I don't have the rest of your codebase, so there is no way for me to test it on my own machine, but here is my best attempt:

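// Build the tuples on the executors, then bring them back to the driver.
val res = filterDF_table_print.rdd.map(row => {
  val row1 = row.getAs[String]("error_reason")
  val make = if (row1.toLowerCase == "patternmatchconstraint") "Invalid Length" else "error_reason"
  ("field", row(0), make)
}).collect()
// Format each tuple as one line, then join all lines into a single string.
val toPrint = res.map { case (x, y, z) => s"$x, $y, $z" }.mkString("\n")
// One println call emits the whole result at once.
println(toPrint)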
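
If the exact layout from the question matters, the three-argument form of mkString (start, separator, end) can add the prefix and the indentation in one step. Below is a minimal sketch that runs without Spark. It assumes the collected rows look like the tuples above, and the names OutvarFormat and format are hypothetical helpers I made up for illustration. The idea is to put "OUTVAR:" at the start and to separate rows with a newline followed by enough spaces to line up under the first row.

```scala
object OutvarFormat {
  // Hypothetical helper: joins the rows into one string that starts with
  // `prefix` and indents every following row to align under the first one.
  def format(rows: Seq[(String, Any, String)], prefix: String = "OUTVAR:"): String = {
    val indent = " " * prefix.length
    rows
      .map { case (x, y, z) => s"$x,$y,$z" }   // one comma-separated line per row
      .mkString(prefix, "\n" + indent, "")     // start, separator, end
  }

  def main(args: Array[String]): Unit = {
    // Stand-in data; in the real job this would be the result of collect().
    val rows = Seq(
      ("field", "FOO1", "Invalid Length"),
      ("field", "FOO2", "Invalid Length"),
      ("field", "FOO3", "Invalid Length")
    )
    println(format(rows))
  }
}
```

In the Spark job you would pass the collected array instead of the stand-in data, for example println(OutvarFormat.format(res)). Whether the Oozie job actually needs the continuation lines indented is something I can't tell from the question, so treat the alignment as optional.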

Upvotes: 2
