Guforu

Reputation: 4023

RowMatrix, MLlib, Java Spark

I have a RowMatrix and my question is: how can I manipulate it by indices? This question is very similar to this one:

Matrix Operation in Spark MLlib in Java

In the end, all I need is a matrix type backed by a good class library. Currently I can't manipulate this object.

Upvotes: 0

Views: 217

Answers (1)

Till Rohrmann

Reputation: 13346

As the JavaDocs of RowMatrix indicate

:: Experimental :: Represents a row-oriented distributed Matrix with no meaningful row indices.

There is no ordering on the rows. You can obtain a breeze.linalg.DenseMatrix from it by calling toBreeze, but the order of the rows is not guaranteed. They are simply inserted into the resulting matrix in the order in which they arrive at the driver. This means that the result of this operation can differ from one run to the next.

If you need a deterministic outcome of the toBreeze operation, you have to use an IndexedRowMatrix. There, every row is assigned a row index, and that index determines the row's position when the breeze.linalg.DenseMatrix is built.
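
To illustrate why the row index matters, here is a minimal plain-Java sketch that does not use Spark at all. The names DeterministicAssembly, Row, assembleByIndex and assembleByArrival are hypothetical and only mimic the idea; the output layout is assumed to be column-major, which is Breeze's default for DenseMatrix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class DeterministicAssembly {

    // Hypothetical stand-in for an indexed row: a row index plus the row's values.
    static final class Row {
        final long index;
        final double[] values;

        Row(long index, double[] values) {
            this.index = index;
            this.values = values;
        }
    }

    // Places each row at the position given by its index.
    // The result is column-major: element (i, j) lives at i + j * numRows.
    static double[] assembleByIndex(List<Row> rows, int numCols) {
        int numRows = rows.size();
        double[] data = new double[numRows * numCols];
        for (Row row : rows) {
            int i = (int) row.index;
            for (int j = 0; j < numCols; j++) {
                data[i + j * numRows] = row.values[j];
            }
        }
        return data;
    }

    // Ignores the index and places rows in arrival order,
    // which is the situation with a matrix that has no row indices.
    static double[] assembleByArrival(List<Row> rows, int numCols) {
        int numRows = rows.size();
        double[] data = new double[numRows * numCols];
        for (int i = 0; i < numRows; i++) {
            for (int j = 0; j < numCols; j++) {
                data[i + j * numRows] = rows.get(i).values[j];
            }
        }
        return data;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>(Arrays.asList(
                new Row(0, new double[] {1.0, 2.0}),
                new Row(1, new double[] {3.0, 4.0}),
                new Row(2, new double[] {5.0, 6.0})));

        double[] indexedFirst = assembleByIndex(rows, 2);
        double[] arrivalFirst = assembleByArrival(rows, 2);

        Collections.reverse(rows); // simulate a different arrival order

        // Index-based assembly is unaffected by the arrival order.
        System.out.println(Arrays.equals(indexedFirst, assembleByIndex(rows, 2)));   // true
        // Arrival-based assembly changes with it.
        System.out.println(Arrays.equals(arrivalFirst, assembleByArrival(rows, 2))); // false
    }
}
```

The sketch assumes the indices are exactly 0 to numRows - 1 with no gaps; a real IndexedRowMatrix may have missing indices, which would need extra handling.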

Starting from the IndexedRowMatrix, you can then use the solution proposed in the linked question, which is:

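import no.uib.cipr.matrix.DenseMatrix;
// ...
// U is an IndexedRowMatrix when the SVD was computed on an IndexedRowMatrix
IndexedRowMatrix U = svd.U();
// toArray$mcD$sp is how Scala's double-specialized toArray appears from Java;
// the final argument (true) asks MTJ to make a deep copy of the array
DenseMatrix U_mtj = new DenseMatrix((int) U.numCols(), (int) U.numRows(), U.toBreeze().toArray$mcD$sp(), true);
// From there, matrix operations are available on U_mtj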
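
If all you need is to read and write individual entries by (row, column), the flat array can also be addressed directly. The following is a rough sketch under the assumption that the array is in column-major order (the default layout of a Breeze DenseMatrix); ColumnMajorView is a hypothetical helper, not part of Spark, Breeze or MTJ.

```java
final class ColumnMajorView {
    private final int numRows;
    private final int numCols;
    private final double[] data;

    ColumnMajorView(int numRows, int numCols, double[] data) {
        if (data.length != numRows * numCols) {
            throw new IllegalArgumentException("data length does not match dimensions");
        }
        this.numRows = numRows;
        this.numCols = numCols;
        this.data = data;
    }

    // Column-major addressing: element (i, j) is stored at i + j * numRows.
    private int offset(int i, int j) {
        if (i < 0 || i >= numRows || j < 0 || j >= numCols) {
            throw new IndexOutOfBoundsException("(" + i + ", " + j + ")");
        }
        return i + j * numRows;
    }

    double get(int i, int j) {
        return data[offset(i, j)];
    }

    void set(int i, int j, double value) {
        data[offset(i, j)] = value;
    }

    public static void main(String[] args) {
        // The 3 x 2 matrix [[1, 2], [3, 4], [5, 6]] stored column by column.
        double[] flat = {1.0, 3.0, 5.0, 2.0, 4.0, 6.0};
        ColumnMajorView m = new ColumnMajorView(3, 2, flat);

        System.out.println(m.get(2, 1)); // 6.0
        m.set(0, 1, 20.0);
        System.out.println(m.get(0, 1)); // 20.0
    }
}
```

The view wraps the array without copying, so writes through set are visible in the original array. It is worth checking the layout against a small known matrix before relying on it.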

Upvotes: 1
