Reputation: 391
I have a RDD(Long, util.List[Foo]) which I want to flatten on the list to look like RDD(Long, Foo) and then eventually call a getCode method which is a part of the Foo. Here is my approach so far
val test = source
.filter(x => x.getFooList != null)
.map(x => (x.getFooList, x.getId))
.map{
case(foo, id) => foo.toArray().map(foo => (foo, id))
}
ideally I would like to have the id at the first position
This method works. However to toArray method converts it from Foo to AnyRef. I cannot call the getCode method on AnyRef. What is the best way to do this?
Upvotes: 2
Views: 1374
Reputation: 14227
convert util.List
to scala List
can solve this issue:
import scala.collection.JavaConverters._
...
case(foo, id) => foo.asScala.map(foo => (foo, id))
...
Upvotes: 1
Reputation: 37852
First - if you want to flatten your RDD you'll have to use flatMap
and not map
. Second, if you want the "id" to come first - place it first in the tuple you're building for each item. And third - since your source RDD contains java.util.List
s, you'll have to convert them (can be done implicitly with the right import) to Scala collections:
import scala.collection.JavaConversions._ // import to get implicit conversion
val test: RDD[(Long, Foo)] = source
.filter(x => x.getFooList != null)
.map(x => (x.getFooList, x.getId))
.flatMap { // use flatMap
case (foo, id) => foo.map(f => (id, f)) // switch the order
}
Upvotes: 2