Leszek Malinowski

Reputation: 111

How to change RowMatrix into Array in Spark or export it as a CSV?

I've got this code in Scala:

val mat: CoordinateMatrix = new CoordinateMatrix(data)
val rowMatrix: RowMatrix = mat.toRowMatrix()

val svd: SingularValueDecomposition[RowMatrix, Matrix] = rowMatrix.computeSVD(100, computeU = true)

val U: RowMatrix = svd.U // The U factor is a RowMatrix.
val S: Vector = svd.s // The singular values are stored in a local dense vector.
val V: Matrix = svd.V // The V factor is a local dense matrix.

val uArray: Array[Double] = U.toArray // does not compile: RowMatrix has no toArray method
val sArray: Array[Double] = S.toArray // works
val vArray: Array[Double] = V.toArray // works

How can I convert U into an array (or a similar type) that can be written out to a CSV file?

Upvotes: 4

Views: 2649

Answers (2)

Leszek Malinowski

Reputation: 111

This works for me:

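import java.io.PrintWriter
import org.apache.spark.rdd.RDD

// Collects every line to the driver, so the whole matrix must fit in driver memory.
def exportRowMatrix(matrix: RDD[String], fileName: String): Unit = {
  val pw = new PrintWriter(fileName)
  try matrix.collect().foreach(line => pw.println(line))
  finally pw.close() // close() also flushes the writer
}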

val rdd = U.rows.map( x => x.toArray.mkString(","))
exportRowMatrix(rdd, "U.csv")
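
Since collect() pulls the whole RDD into driver memory at once, a variant that streams the rows may be safer for a large U. The sketch below is only an assumption about how that could look: writeLines is a hypothetical helper name, and it takes a plain Iterator[String] so it can be fed from RDD.toLocalIterator, which fetches one partition at a time.

```scala
import java.io.PrintWriter

// Hypothetical helper (not part of the original answer): writes each line
// as it arrives instead of materialising all lines in memory first.
def writeLines(lines: Iterator[String], fileName: String): Unit = {
  val pw = new PrintWriter(fileName)
  try lines.foreach(line => pw.println(line))
  finally pw.close() // close() also flushes the writer
}

// Assumed usage with the U from the question (requires a running Spark job):
// writeLines(U.rows.map(_.toArray.mkString(",")).toLocalIterator, "U.csv")
```

With toLocalIterator the driver needs roughly as much memory as the largest partition rather than the whole matrix, at the cost of running the partitions one after another.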

Upvotes: 0

eliasah

Reputation: 40370

This is a basic operation. Given that U is a RowMatrix:

val U = svd.U

rows is a member of RowMatrix that exposes the matrix as an RDD[Vector], one element per row.

Take U.rows, convert each Vector to an Array, and join its elements into a comma-separated string. The result is an RDD[String]:

val rdd = U.rows.map( x => x.toArray.mkString(","))

All you have to do now is save the RDD:

rdd.saveAsTextFile(path)
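
Note that saveAsTextFile writes a directory at path containing one part file per partition, not a single CSV file. If a single file is wanted, one option is to coalesce to one partition first. The following is a minimal sketch under that assumption; saveAsSingleCsv is a hypothetical name, and funnelling everything through one partition only makes sense when U is small enough for a single task to handle.

```scala
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// Hypothetical helper (not from the original answer): writes the matrix as
// CSV lines into a directory that contains a single part file.
def saveAsSingleCsv(matrix: RowMatrix, path: String): Unit = {
  matrix.rows
    .map(row => row.toArray.mkString(",")) // one CSV line per matrix row
    .coalesce(1)                           // merge into one partition, hence one part file
    .saveAsTextFile(path)                  // fails if the output path already exists
}

// Assumed usage with the U from the question:
// saveAsSingleCsv(U, "U_csv")
```

The output still lands inside a directory (for example U_csv/part-00000), so the part file has to be renamed or copied if a file literally named U.csv is required.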

Upvotes: 3
