Reputation: 16640
I'm using the Spark shell (1.3.1), which is a Scala shell. The simplified situation that needs iteration over the cells of a Row is something like this:
import org.apache.commons.lang.StringEscapeUtils
var result = sqlContext.sql("....")
var rows = result.collect() // Array[org.apache.spark.sql.Row]
var row = rows(0) // org.apache.spark.sql.Row
var line = row.map(cell => StringEscapeUtils.escapeCsv(cell)).mkString(",")
// error: value map is not a member of org.apache.spark.sql.Row
println(line)
My problem is that Row does not have a map method and, as far as I know, it cannot be converted to an Array or List, so I cannot escape each cell in this style. I could write a loop using an index variable, but it would be inconvenient. I would like to iterate over the cells in a situation like this:
result.collect().map(row => row.map(cell => StringEscapeUtils.escapeCsv(cell)).mkString(",")).mkString("\n")
(These are typically small results; they fit into client memory many times over.)
Is there any way to iterate over the cells of a Row? Is there any syntax for putting an index-based loop in place of row.map(...) in the last snippet?
Upvotes: 4
Views: 4829
Reputation: 19
You can call toSeq on a Row, and the resulting Seq has map. toSeq returns the cells in the same order as the row's columns.
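For example, a minimal sketch applied to the snippet from the question (it assumes the result DataFrame and StringEscapeUtils import from the question; String.valueOf is used here to turn each Any cell into a String before escaping, and it also handles null cells):
import org.apache.commons.lang.StringEscapeUtils

val csv = result.collect().map { row =>
  // row.toSeq yields the cells as a Seq[Any], which supports map
  row.toSeq.map(cell => StringEscapeUtils.escapeCsv(String.valueOf(cell))).mkString(",")
}.mkString("\n")
println(csv)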
Upvotes: 1