Nigel Ng

Reputation: 593

PySpark: converting a RowMatrix to a DataFrame or RDD

I have a square PySpark RowMatrix that looks like this:

>>> row_mat.numRows()
100
>>> row_mat.numCols()
100
>>> row_mat.rows.first()
SparseVector(100, {0: 0.0, 1: 0.0018, 2: 0.1562, 3: 0.0342...})

I would like to run pyspark.ml.feature.PCA, but its fit() method only accepts a DataFrame. Is there a way to convert this RowMatrix into a DataFrame?

Or is there a better way to do it?

Upvotes: 4

Views: 2615

Answers (1)

user6022341


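The rows attribute of a RowMatrix is already an RDD of vectors. Wrap each vector in a one-element tuple so that toDF() can infer a single-column schema: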

row_mat.rows.map(lambda x: (x, )).toDF()
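
One caveat, as far as I know: on Spark 2.0 and later, RowMatrix rows are pyspark.mllib.linalg vectors, while pyspark.ml.feature.PCA expects pyspark.ml.linalg vectors, so the one-liner above may fail at fit() with a type mismatch. A minimal sketch of a workaround, assuming Spark 2.0+ (where mllib vectors have asML()) and an active SparkSession; the helper name row_matrix_to_df and the column name "features" are made up for illustration:

```python
def row_matrix_to_df(row_mat, col="features"):
    """Turn a RowMatrix into a one-column DataFrame of ml vectors.

    Assumes Spark 2.0+ and an active SparkSession (needed for RDD.toDF).
    The function name and the default column name are illustrative only.
    """
    # row_mat.rows is an RDD of pyspark.mllib.linalg vectors.
    # asML() converts each one to its pyspark.ml.linalg counterpart,
    # and the one-element tuple makes each vector a single-column row.
    return row_mat.rows.map(lambda v: (v.asML(),)).toDF([col])
```

The result can then be passed to PCA, for example PCA(k=10, inputCol="features", outputCol="pca").fit(row_matrix_to_df(row_mat)), where k=10 and "pca" are arbitrary placeholder values.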

Upvotes: 6
