Reputation: 51
I have a JavaPairRDD called "rdd", with its tuples defined as:
<Integer,String[]>
I want to extract the highest key using the max() function, but it requires a Comparator as an argument. Could you give me an example of how to do this, please?
Example:
rdd={(22,[ff,dd])(8,[hh,jj])(6,[rr,tt]).....}
After applying rdd.max(....), it should give me:
int max_key=22;
Please help, in Java if possible.
Upvotes: 0
Views: 666
Reputation: 51
Even though @David's answer was logical, it didn't work for me, because max() always requires a Comparator. When I passed a Comparator, I got an exception (operation not serializable), so I tried an Ordering instead, but this time the max key came out as 1 (which is in fact the minimum). In the end I used the easiest way: I sorted my pair RDD in descending order and then extracted the first() tuple.
int max_key=rdd.first()._1();
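Sorting the whole RDD does more work than a single maximum needs. A rough sketch of an alternative that avoids the Comparator altogether is to reduce the keys with Math.max. The Spark call appears only in a comment and is an assumption about how this would look with rdd.keys().reduce(...), not something tested in this thread; the runnable part uses a plain Java list as a local stand-in, and the class name MaxKeyByReduce is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class MaxKeyByReduce {

    // Local stand-in for the assumed Spark call:
    //   int maxKey = rdd.keys().reduce((a, b) -> Math.max(a, b));
    // Here the same pairwise reduction runs over an in-memory list instead.
    public static int maxKey(List<Integer> keys) {
        return keys.stream()
                   .reduce((a, b) -> Math.max(a, b))
                   .orElseThrow(() -> new IllegalArgumentException("no keys"));
    }

    public static void main(String[] args) {
        // Keys taken from the example in the question.
        List<Integer> keys = Arrays.asList(22, 8, 6);
        System.out.println(maxKey(keys));
    }
}
```

The reduction only ever keeps the larger of two keys, so no ordering object has to be passed in.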
Upvotes: 0
Reputation: 11593
Your approach isn't working because tuples don't have an inherent ordering.
What you're trying to do is get the maximum of the keys. The easiest way to do this is to extract the keys and then take the max, like so:
keyRdd = rdd.keys()
max_key = keyRdd.max()
Note: I'm not a Java Spark user, so the syntax may be a bit off.
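In the Java API, max() on the key RDD takes a java.util.Comparator, and Spark has to serialize that comparator to ship it to the executors, which is a likely source of "not serializable" errors. A minimal sketch of one way around this, assuming a hypothetical helper interface SerializableComparator (it is not a Spark type). The Spark call is shown only in a comment; the runnable part uses Collections.max as a local stand-in and checks that the comparator really can be serialized.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class MaxKeySketch {

    // Hypothetical helper (not part of Spark): a Comparator that is also Serializable.
    public interface SerializableComparator<T> extends Comparator<T>, Serializable {}

    // A lambda assigned to a Serializable functional interface is itself serializable.
    // Ascending comparison, so "max" means the largest key.
    public static final SerializableComparator<Integer> BY_VALUE =
            (a, b) -> Integer.compare(a, b);

    // Returns true if plain Java serialization accepts the object.
    public static boolean isSerializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Local stand-in for the assumed Spark call:
    //   int maxKey = rdd.keys().max(BY_VALUE);
    public static int maxKey(List<Integer> keys) {
        return Collections.max(keys, BY_VALUE);
    }

    public static void main(String[] args) {
        // Keys taken from the example in the question.
        List<Integer> keys = Arrays.asList(22, 8, 6);
        System.out.println(isSerializable(BY_VALUE));
        System.out.println(maxKey(keys));
    }
}
```

If the comparator's sign is flipped (comparing b to a), max() returns the smallest key instead, so it is worth checking the direction on a small sample first.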
Upvotes: 1