Reputation: 93
I have an RDD with a list of floats:
[1.0, 3.0, 4.0, 2.0]
and I want a transformed RDD like this:
[(1.0, 3.0), (1.0, 4.0), (1.0, 2.0), (3.0, 4.0), (3.0, 2.0), (4.0, 2.0)]
Any help is appreciated.
Upvotes: 0
Views: 123
Reputation: 27455
You need RDD.cartesian.
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in self and b is in other.
>>> rdd = sc.parallelize([1, 2])
>>> sorted(rdd.cartesian(rdd).collect())
[(1, 1), (1, 2), (2, 1), (2, 2)]
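Note that this returns the pairs in both directions, and it also pairs each element with itself (e.g. (1, 1)), so it yields more than the output you asked for. Hopefully this is not a problem for you.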
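If you only want each pair once, in the order the elements appear in the RDD (as in the question's expected output), one possible approach is to tag every element with its position using zipWithIndex, take the Cartesian product, and keep only the pairs where the first index is smaller than the second. This is just a sketch: the helper name `ordered_pairs` is made up for illustration, and the usage lines assume you already have a SparkContext called `sc`.

```python
def ordered_pairs(rdd):
    """Return the pairs (a, b) where a appears before b in the RDD.

    `ordered_pairs` is a hypothetical helper name, not part of PySpark.
    """
    # zipWithIndex turns each element into (value, position)
    indexed = rdd.zipWithIndex()
    return (
        indexed.cartesian(indexed)
        # keep a pair only if the first element comes earlier than the second;
        # this drops the reversed duplicates and the self-pairs
        .filter(lambda pair: pair[0][1] < pair[1][1])
        # strip the positions again, leaving (value_a, value_b)
        .map(lambda pair: (pair[0][0], pair[1][0]))
    )

# Usage, assuming an existing SparkContext named `sc`:
# rdd = sc.parallelize([1.0, 3.0, 4.0, 2.0])
# ordered_pairs(rdd).collect()
```

For the list in the question this should give the six pairs you listed, though I would not rely on the order in which collect() returns them. Keep in mind that cartesian still builds all n * n pairs before the filter runs, so it can get expensive on a large RDD.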
Upvotes: 1