Reputation: 789
I'm trying to execute this line on a CoordinateMatrix...
test = test.entries.map(lambda (i, j, v): (j, (i, v)))
where the equivalent seems to work in Scala but fails in PySpark. The error I get when the line executes:
'MatrixEntry' object is not iterable
And confirming that I am working with a CoordinateMatrix...
>>> test = test_coord.entries
>>> test.first()
MatrixEntry(0, 0, 7.0)
Anyone know what might be off?
Upvotes: 1
Views: 229
Reputation: 215117
Suppose test is a CoordinateMatrix. Then:
test.entries.map(lambda e: (e.j, (e.i, e.value)))
A side note: you can't unpack a tuple in a lambda's parameter list (Python 3 removed tuple parameter unpacking), so map(lambda (i, j, v): ...) is not going to work here either, although that doesn't seem to be the cause of this particular error.
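If the RDD held plain tuples instead of MatrixEntry objects, a Python 3-safe way to get the same result is to index into the tuple; a minimal sketch, assuming sc is an existing SparkContext:
rdd = sc.parallelize([(1, 2, 3.0), (4, 5, 6.0)])
# index into the tuple instead of unpacking it in the parameter list
rdd.map(lambda t: (t[1], (t[0], t[2]))).collect()
# [(2, (1, 3.0)), (5, (4, 6.0))]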
Example:
from pyspark.mllib.linalg.distributed import CoordinateMatrix

test = CoordinateMatrix(sc.parallelize([(1, 2, 3), (4, 5, 6)]))
test.entries.collect()
# [MatrixEntry(1, 2, 3.0), MatrixEntry(4, 5, 6.0)]
test.entries.map(lambda e: (e.j, (e.i, e.value))).collect()
# [(2L, (1L, 3.0)), (5L, (4L, 6.0))]
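For completeness, the question's error can be reproduced on the same data: a MatrixEntry only exposes its fields as attributes and is not iterable, so unpacking it fails (a sketch, reusing the test matrix above):
entry = test.entries.first()      # MatrixEntry(1, 2, 3.0)
entry.i, entry.j, entry.value     # attribute access works: (1L, 2L, 3.0)
# i, j, v = entry                 # would raise: TypeError: 'MatrixEntry' object is not iterable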
Upvotes: 2