hatellla

Reputation: 5132

Convert Set to individual objects in Scala Spark

I have an RDD in which each record is of the form ((x, y), ExampleObject).

So, each record consists of 2 parts:

  1. a tuple of x and y (both are strings)
  2. an exampleObject of class ExampleObject

The ExampleObject class in turn contains 2 attributes:

  1. setObjects1 of class type SetObject1
  2. setObjects2 of class type SetObject2

Each SetObject1 in turn contains 2 attributes:

  1. singleObject of class type SingleObject
  2. setObjects3 of class type SetObject3

You can assume every attribute has an associated getter. There is another class, SingleTransformedObject, to which I want to map the singleObject objects.
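For clarity, here is a simplified sketch of how I think of these classes (illustrative only; the real classes have more fields and logic):

// Simplified, illustrative class shapes
case class SingleObject(value: String)
case class SetObject3(value: String)
case class SetObject2(value: String)

case class SetObject1(singleObject: SingleObject, setObjects3: Set[SetObject3]) {
  def getSingleObject: SingleObject = singleObject
  def getSetObjects3: Set[SetObject3] = setObjects3
}

case class ExampleObject(setObjects1: Set[SetObject1], setObjects2: Set[SetObject2]) {
  def getSetObjects1: Set[SetObject1] = setObjects1
  def getSetObjects2: Set[SetObject2] = setObjects2
}

case class SingleTransformedObject(value: String)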

Now, what I want to do is read this RDD and get a mapped RDD that contains the SingleTransformedObject instances. How can I do that? My code for the initial stages looks like this:

val filteredRDD = inputRDD
  .filter { case ((x, _), _) => x == "2321" }
  .map { case (key, exampleObject) => exampleObject.getSetObjects1 }

After this, I am not sure how to break the set of objects into individual objects and apply a transformation to each of them.

Could you provide an example?

Upvotes: 0

Views: 124

Answers (1)

Arjan

Reputation: 9874

Since exampleObject.getSetObjects1 seems to return a Set (or some other collection), map would result in an RDD[Set[SetObject1]]. Based on the question, I guess you're looking for an RDD[SetObject1]. In that case you need flatMap instead of map: flatMap flattens each returned collection into individual elements of the resulting RDD.

val filteredRDD = inputRDD
    .filter { case ((x, _), _) => x == "2321" }
    .flatMap { case (key, exampleObject) => exampleObject.getSetObjects1 }
    .map { setObject1 => /* code here to convert each SetObject1 to a SingleTransformedObject */ }
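For instance, a rough end-to-end sketch could look like the snippet below. Note that transform and the SingleTransformedObject construction are placeholders assumed from the question, not the real API; adjust the getter names to your actual classes:

// Sketch only: transform() stands in for whatever turns a SingleObject
// into a SingleTransformedObject.
def transform(single: SingleObject): SingleTransformedObject =
  SingleTransformedObject(single.toString)  // replace with the real conversion

val transformedRDD = inputRDD
    .filter { case ((x, _), _) => x == "2321" }                          // keep only key "2321"
    .flatMap { case (_, exampleObject) => exampleObject.getSetObjects1 } // RDD[SetObject1]
    .map(setObject1 => transform(setObject1.getSingleObject))            // RDD[SingleTransformedObject]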

Upvotes: 2
