Reputation: 669
Suppose I have an RDD of Array[Double], with n columns. I want to apply a filter on the last column (say, the value > some constant).
Upvotes: 1
Views: 523
Reputation: 9734
Something like this:
val rdd: RDD[Array[Double]] = ...
val filtered: RDD[Array[Double]] = rdd.filter(arr => arr.last > some_value)
I don't think it really matters whether you choose Array or Vector. Spark's overall overhead is much higher than any performance or memory benefit of Arrays over Vectors.
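For completeness, here is a minimal self-contained sketch of the same filter that can be run locally. The object name `FilterOnLastColumn`, the helper `lastAbove`, the sample rows, and the threshold of 5.0 are all made up for illustration, and the `local[*]` master is only an assumption for trying it out on one machine. The helper also guards against empty rows, because `last` throws on an empty array.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object FilterOnLastColumn {
  // Hypothetical helper: builds a predicate that is true when a row's
  // last element is strictly greater than the threshold.
  // Empty rows are rejected first, since `last` throws on an empty array.
  def lastAbove(threshold: Double): Array[Double] => Boolean =
    row => row.nonEmpty && row.last > threshold

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("filter-last-column").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Illustrative data: three rows with n = 3 columns each.
      val rdd: RDD[Array[Double]] = sc.parallelize(Seq(
        Array(1.0, 2.0, 0.5),
        Array(3.0, 4.0, 7.5),
        Array(5.0, 6.0, 9.0)
      ))

      val threshold = 5.0
      val filtered: RDD[Array[Double]] = rdd.filter(lastAbove(threshold))

      // collect() is fine here only because the sample is tiny.
      filtered.collect().foreach(row => println(row.mkString(", ")))
    } finally {
      sc.stop()
    }
  }
}
```

With this sample data, only the two rows whose last value is above 5.0 are kept. If the column of interest were not the last one, indexing with `row(i)` inside the predicate would work the same way.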
Upvotes: 2