learning_spark

Reputation: 669

Applying filter on an RDD of Vectors/Array[Double]

Suppose I have an RDD of Array[Double], with n columns. I want to apply a filter on the last column (say, the value > some constant).

Upvotes: 1

Views: 523

Answers (1)

Eugene Zhulenev

Reputation: 9734

Something like this:

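import org.apache.spark.rdd.RDD
val rdd: RDD[Array[Double]] = ...
// `last` is a parameterless method, so it is called without parentheses
val filtered: RDD[Array[Double]] = rdd.filter(arr => arr.last > some_value)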

I don't think it really matters whether you choose Array or Vector. The overall overhead of Spark is much higher than any performance or memory difference between Arrays and Vectors.
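For completeness, here is a rough sketch of the same last-column filter for both representations, assuming the Vector in question is MLlib's `org.apache.spark.mllib.linalg.Vector`. The names `FilterLastColumn`, `lastExceeds`, `filterArrays`, `filterVectors` and the local `SparkContext` setup are made up for illustration and are not part of the question. Note that `arr.last` throws on an empty array, so this version uses `lastOption` and simply drops empty rows.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD

// Hypothetical helper object, for illustration only.
object FilterLastColumn {
  // True when the array is non-empty and its last element is above the threshold.
  def lastExceeds(arr: Array[Double], threshold: Double): Boolean =
    arr.lastOption.exists(_ > threshold)

  // Same check for an MLlib Vector, using its size and index-based access.
  def lastExceeds(v: Vector, threshold: Double): Boolean =
    v.size > 0 && v(v.size - 1) > threshold

  def filterArrays(rdd: RDD[Array[Double]], threshold: Double): RDD[Array[Double]] =
    rdd.filter(arr => lastExceeds(arr, threshold))

  def filterVectors(rdd: RDD[Vector], threshold: Double): RDD[Vector] =
    rdd.filter(v => lastExceeds(v, threshold))

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, just to make the sketch runnable on one machine.
    val conf = new SparkConf().setAppName("filter-last-column").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rows: RDD[Array[Double]] =
        sc.parallelize(Seq(Array(1.0, 2.0, 3.0), Array(4.0, 5.0, 0.5)))

      // Filter the arrays directly.
      val keptArrays = filterArrays(rows, 1.0).collect()

      // Convert each row to a dense Vector and apply the equivalent filter.
      val keptVectors = filterVectors(rows.map(a => Vectors.dense(a)), 1.0).collect()

      keptArrays.foreach(a => println(a.mkString(", ")))
      keptVectors.foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Either way the filter is a single narrow transformation, so the choice between Array and Vector should mostly come down to what the rest of the pipeline expects.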

Upvotes: 2
