jtitusj

Reputation: 3084

Spark: Converting RDD[(Long, Array[Double])] to RDD[(Long, Long, Double)]

I have an RDD with each entry of the format (Long, Array[Double]). For example:

    val A = sc.parallelize(Seq((0L, Array(5.0, 8.3)), (1L, Array(4.2, 1.2))))

I want to transform A to the form:

    (0, 0, 5.0), (0, 1, 8.3), (1, 0, 4.2), (1, 1, 1.2)

where the second element in the tuple is the index of the value from the array.

Upvotes: 0

Views: 505

Answers (2)

Nyavro

Reputation: 8866

You can do it this way:

    A.flatMap { case (v, arr) => arr.zipWithIndex.map { case (a, i) => (v, i, a) } }
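Note that `zipWithIndex` on an `Array` gives the index as an `Int`, so this produces an `RDD[(Long, Int, Double)]`. If you need exactly the type in the title, adding a `.toLong` on the index should do it; a minimal sketch:

    A.flatMap { case (v, arr) =>
      arr.zipWithIndex.map { case (a, i) => (v, i.toLong, a) }
    }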

Upvotes: 1

Shadowlands

Reputation: 15074

Try this:

    A.flatMap { case (first, dbls) => dbls.zipWithIndex.map { case (dbl, ix) => (first, ix.toLong, dbl) } }
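For reference, a self-contained sketch (assuming a running `SparkContext` named `sc`, as in the question; the val name `B` is just for illustration):

    // Sample data matching the question
    val A = sc.parallelize(Seq((0L, Array(5.0, 8.3)), (1L, Array(4.2, 1.2))))

    // Flatten each (key, array) pair into one (key, index, value) triple per array element
    val B = A.flatMap { case (first, dbls) =>
      dbls.zipWithIndex.map { case (dbl, ix) => (first, ix.toLong, dbl) }
    }

    B.collect()
    // Array((0,0,5.0), (0,1,8.3), (1,0,4.2), (1,1,1.2))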

Upvotes: 1
