Notinlist

Reputation: 16640

Iterating on org.apache.spark.sql.Row

I'm using the Spark shell (1.3.1), which is a Scala shell. The simplified situation that requires iterating over a Row looks like this:

import org.apache.commons.lang.StringEscapeUtils

var result = sqlContext.sql("....")
var rows = result.collect() // Array[org.apache.spark.sql.Row]
var row = rows(0) // org.apache.spark.sql.Row
var line = row.map(cell => StringEscapeUtils.escapeCsv(cell)).mkString(",")
// error: value map is not a member of org.apache.spark.sql.Row
println(line)

My problem is that Row does not have map and, as far as I know, it cannot be converted to an Array or List, so I cannot escape each cell in this style. I could write a loop using an index variable, but that would be inconvenient. I would like to iterate over the cells in a situation like this:

result.collect().map(row => row.map(cell => StringEscapeUtils.escapeCsv(cell)).mkString(",")).mkString("\n")

(The results are typically small; they fit comfortably into the client's memory.)

Is there any way to iterate over the cells of a Row? Alternatively, is there any syntax for putting an index-based loop in place of row.map(...) in the last snippet?

Upvotes: 4

Views: 4829

Answers (1)

Chris Husband

Reputation: 19

You can call toSeq on the Row, which returns a Seq that has map. The Seq preserves the column order of the Row.
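For example, applied to the snippet from the question (a minimal sketch: note that toSeq yields a Seq[Any], so each cell must be converted to a String before escaping; String.valueOf also handles null cells):

import org.apache.commons.lang.StringEscapeUtils

// Escape a single Row as one CSV line
val line = row.toSeq
  .map(cell => StringEscapeUtils.escapeCsv(String.valueOf(cell)))
  .mkString(",")

// Or escape the whole collected result at once
val csv = result.collect()
  .map(r => r.toSeq.map(cell => StringEscapeUtils.escapeCsv(String.valueOf(cell))).mkString(","))
  .mkString("\n")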

Upvotes: 1
