user2200660

Reputation: 1281

How to inflate a pair RDD on its values?

I use Scala and I want to convert an RDD[(String, List[String])] into an RDD[(String, String)], with each element of the list becoming its own row, e.g.

cat List[2,4]
dog List[6,5,4]

should be converted to

cat 2
cat 4
dog 6
dog 5
dog 4

Upvotes: 1

Views: 145

Answers (1)

Knows Not Much

Reputation: 31576

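Whenever you need to inflate a collection, 'flatMap' is quite useful: map each (key, list) pair to a list of (key, element) pairs and let flatMap flatten the result.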

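// Assumes an existing SparkContext named sc (e.g. in spark-shell)
val pairs = List(("cat", List(2, 4)), ("dog", List(6, 5, 4)))
val rdd = sc.parallelize(pairs)
// Emit one (key, value) pair per element of each list
val flat = rdd.flatMap { case (key, values) => values.map((key, _)) }
flat.collect().foreach(println)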

Output:

(cat,2)
(cat,4)
(dog,6)
(dog,5)
(dog,4)
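As a possible alternative, pair RDDs also offer flatMapValues, which does the same inflation without having to rebuild the key by hand and keeps the original RDD's partitioning. The sketch below is only an illustration under some assumptions: it runs Spark in local mode, uses String values as in the question, and the names InflatePairs and inflate are made up for this example.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical wrapper object, not part of the original answer
object InflatePairs {

  // Turn each (key, list) pair into one (key, element) pair per list element
  def inflate(rdd: RDD[(String, List[String])]): RDD[(String, String)] =
    rdd.flatMapValues(values => values)

  def main(args: Array[String]): Unit = {
    // Local-mode context, assumed here so the example is self-contained
    val conf = new SparkConf().setAppName("inflate-pairs").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(Seq(
        ("cat", List("2", "4")),
        ("dog", List("6", "5", "4"))
      ))
      // Print each pair as "key value", the row format asked for in the question
      inflate(rdd).collect().foreach { case (k, v) => println(s"$k $v") }
    } finally {
      sc.stop()
    }
  }
}
```

In spark-shell, where sc already exists, the one-liner rdd.flatMapValues(values => values) should be enough.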

Upvotes: 6
