samba

Reputation: 3111

Scala - How to convert a pair RDD to an RDD?

I have an RDD[Sale] and want to keep only the latest sale for each id. To do that, I created a pair RDD keyed by id, then grouped by key and took the sale with the highest timestamp:

val sales: RDD[(String, Sale)] = rawSales.map(sale => sale.id -> sale)
      .groupByKey()
      .mapValues(_.maxBy(_.timestamp))

But how do I get back to an RDD[Sale] from the pair RDD in this case?

The only way I have found is the following:

val value: RDD[Sale] = sales.map(salePaired => salePaired._2)

Is this the idiomatic solution?

Upvotes: 0

Views: 716

Answers (1)

Mansoor Baba Shaik

Reputation: 492

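You can get the keys or the values of a pair RDD directly through its keys and values methods, much like you would on a Map. Calling sales.values gives the same result as sales.map(_._2) in the question: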

val keys: RDD[String] = sales.keys
val values: RDD[Sale] = sales.values
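
For completeness, here is a minimal sketch of the whole pipeline ending in an RDD[Sale]. The Sale case class and its fields (id, timestamp, amount) are assumptions, since the question does not show the real definition, and the LatestSales object and latestSales function are hypothetical names. The sketch also swaps groupByKey for reduceByKey, which is not what the question uses: it keeps only one candidate sale per key while combining, instead of collecting every sale for a key before picking the maximum.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical stand-in for the Sale type, which the question does not show.
case class Sale(id: String, timestamp: Long, amount: Double)

object LatestSales {
  // Keeps the sale with the greatest timestamp for each id, then drops the keys.
  def latestSales(rawSales: RDD[Sale]): RDD[Sale] =
    rawSales
      .map(sale => sale.id -> sale)                                     // RDD[(String, Sale)]
      .reduceByKey((a, b) => if (a.timestamp >= b.timestamp) a else b)  // one Sale per id
      .values                                                           // back to RDD[Sale]

  def main(args: Array[String]): Unit = {
    // Local master chosen only so the sketch can run on a single machine.
    val conf = new SparkConf().setAppName("latest-sales").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rawSales = sc.parallelize(Seq(
        Sale("a", 1L, 10.0),
        Sale("a", 2L, 12.5),
        Sale("b", 5L, 7.0)
      ))
      // Output order is not guaranteed across partitions.
      latestSales(rawSales).collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

If two sales for the same id share a timestamp, this sketch keeps whichever one the reduction happens to see first, so add a tie-breaker if that matters for your data.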

Upvotes: 1
