KyBe

Reputation: 842

Is there a filter function that stops once it has found the first n elements matching a predicate?

I ask because I need to find one specific element in an RDD[(Int, Array[Double])] whose keys are unique. Filtering the entire RDD is costly when I need only one element, and I already know its key.

val wantedkey = 94
val res = rdd.filter( x => x._1 == wantedkey )

Thank you for your advice.

Upvotes: 1

Views: 137

Answers (2)

gasparms

Reputation: 3354

Look at the lookup function in PairRDDFunctions.scala:

def lookup(key: K): Seq[V]

Return the list of values in the RDD for key key. This operation is 
done efficiently if the RDD has a known partitioner by only searching 
the partition that the key maps to.

Example

val a = sc.parallelize(List("dog", "tiger", "lion", "cat", "panther", "eagle"), 2)
val b = a.keyBy(_.length)  // key each word by its length
b.lookup(5)
res0: Seq[String] = WrappedArray(tiger, eagle)
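
Applied to the RDD from the question, this could look like the sketch below. It is only an illustration under assumptions: the sample data, the partition count of 4, the local master, and the helper name partitionByKey are all hypothetical. It uses partitionBy with a HashPartitioner so that the RDD has a known partitioner, which is the case the quoted documentation describes.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object LookupSketch {
  // Hypothetical helper: give the RDD a known partitioner and cache it,
  // so the shuffle is paid once and later lookups scan a single partition.
  def partitionByKey(rdd: RDD[(Int, Array[Double])],
                     numPartitions: Int): RDD[(Int, Array[Double])] =
    rdd.partitionBy(new HashPartitioner(numPartitions)).cache()

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("lookup-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Made-up data shaped like the question's RDD: unique Int keys, Array[Double] values.
      val rdd = sc.parallelize(Seq(
        1 -> Array(0.1, 0.2),
        94 -> Array(9.4, 4.9),
        200 -> Array(2.0, 0.0)
      ))

      val partitioned = partitionByKey(rdd, 4)

      val wantedkey = 94
      // lookup returns every value stored under the key; keys are unique here,
      // so headOption yields the single value, or None if the key is absent.
      val res: Option[Array[Double]] = partitioned.lookup(wantedkey).headOption
      println(res.map(_.mkString(", ")))
    } finally sc.stop()
  }
}
```

Note that partitionBy itself shuffles the whole RDD, so for a single query it is unlikely to beat a plain filter. It is more likely to pay off when the same RDD is queried by key many times.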

Upvotes: 1

Nikita

Reputation: 4515

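All transformations are lazy; they are computed only when you call an action on them. Because first() needs only one element, Spark does not have to materialize the whole filtered RDD: it scans partitions incrementally and stops as soon as a match is found. So you can just write: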

val wantedkey = 94
val res = rdd.filter( x => x._1 == wantedkey ).first()
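
One caveat: first() throws an UnsupportedOperationException when the filtered RDD is empty, that is, when the key does not exist. A minimal sketch of a variant that returns an Option instead is shown below. The helper name firstWithKey, the sample data, and the local master are assumptions made for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object FirstMatchSketch {
  // Hypothetical helper: take(1) is what first() uses internally, but it
  // returns an empty array instead of throwing when nothing matches.
  def firstWithKey(rdd: RDD[(Int, Array[Double])],
                   wantedkey: Int): Option[(Int, Array[Double])] =
    rdd.filter { case (key, _) => key == wantedkey }.take(1).headOption

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("first-match-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Made-up data shaped like the question's RDD.
      val rdd = sc.parallelize(Seq(
        1 -> Array(0.1, 0.2),
        94 -> Array(9.4, 4.9),
        200 -> Array(2.0, 0.0)
      ))

      println(firstWithKey(rdd, 94).map(_._2.mkString(", "))) // key present
      println(firstWithKey(rdd, 7).isEmpty)                   // key absent
    } finally sc.stop()
  }
}
```

If the wanted key sits in a late partition, or is missing entirely, most or all partitions are still scanned, so this saves work only on average.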

Upvotes: 1
