Jill Clover

Reputation: 2328

RDD filter with other function

I know how to filter an RDD, e.g. val y = rdd.filter(e => e % 2 == 0), but I do not know how to combine filter with another function such as Row.

In val rst = rdd.map(ab => Row(ab.a, ab.b)), I want to filter out the entries with ab.b > 0, but I tried putting filter in several places and none of them worked.

Upvotes: 1

Views: 81

Answers (1)

Andrey Tyukin

Reputation: 44918

I'm not sure about the "out" part in "filter out": do you want to keep those entries, or do you want to get rid of them? If you want to drop all entries with ab.b > 0, then you need

val rst = rdd.filter(ab => !(ab.b > 0)).map(ab => Row(ab.a, ab.b))

If you want to retain only the entries with ab.b > 0, then try

val rst = rdd.filter(_.b > 0).map(ab => Row(ab.a, ab.b))

The underscore in _.b > 0 is placeholder syntax for the lambda's single argument, so the line above is simply the shorter form of

val rst = rdd.filter(ab => ab.b > 0).map(ab => Row(ab.a, ab.b))
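
For reference, here is a minimal sketch that puts both variants side by side. The case class AB, the object name FilterThenMap, and the sample data are assumptions made up for illustration, since the question does not show the element type of rdd. As far as I know, Spark's RDD has filter but no filterNot (unlike Scala collections), so the "drop" variant negates the predicate instead.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SparkSession}

// Hypothetical element type; the real one is not shown in the question.
case class AB(a: String, b: Int)

object FilterThenMap {
  // Keep only the entries with b > 0, then convert each one to a Row.
  def keepPositive(rdd: RDD[AB]): RDD[Row] =
    rdd.filter(_.b > 0).map(ab => Row(ab.a, ab.b))

  // Drop the entries with b > 0 by negating the same predicate.
  def dropPositive(rdd: RDD[AB]): RDD[Row] =
    rdd.filter(ab => !(ab.b > 0)).map(ab => Row(ab.a, ab.b))

  def main(args: Array[String]): Unit = {
    // Local session purely for trying the sketch out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("filter-then-map")
      .getOrCreate()

    val rdd = spark.sparkContext.parallelize(
      Seq(AB("x", 1), AB("y", 0), AB("z", -3)))

    keepPositive(rdd).collect().foreach(println) // only the entry built from AB("x", 1)
    dropPositive(rdd).collect().foreach(println) // the entries built from AB("y", 0) and AB("z", -3)

    spark.stop()
  }
}
```

The filter goes before the map because after the map the elements are Rows, which no longer have a .b field; filtering afterwards would need something like row.getInt(1) instead.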

Upvotes: 1
