Anaadih.pradeep

Reputation: 2593

unexpected Error in reduce

While finding the max value with reduce in PySpark, I am getting the unexpected result below.

agg.reduce(lambda a, b: a if a > b else b)

and my sample data is

(u'2013-10-17', 80325.0)
(u'2014-01-01', 68521.0)
(u'2013-11-10', 83691.0)
(u'2013-11-14', 149289.0)
(u'2013-11-18', 94756.0)
(u'2014-01-30', 126171.0)

and result is

(u'2014-07-24', 97088.0)

It should give a result greater than 94756.

Thanks, sPradeep

Upvotes: 0

Views: 28

Answers (2)

user7312909

Reputation: 11

Just use max with key:

rdd.max(key=lambda x: x[1])
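`RDD.max(key=...)` uses the same key semantics as Python's built-in `max`, so the behavior can be illustrated without a Spark cluster. A minimal sketch on the sample data from the question (plain Python stands in for the RDD here, which is an assumption for illustration only):

```python
# Sample (date, value) pairs taken from the question.
data = [
    (u'2013-10-17', 80325.0),
    (u'2014-01-01', 68521.0),
    (u'2013-11-10', 83691.0),
    (u'2013-11-14', 149289.0),
    (u'2013-11-18', 94756.0),
    (u'2014-01-30', 126171.0),
]

# The key selects the second field (the value), so max compares the
# numbers instead of the whole tuples.
print(max(data, key=lambda x: x[1]))  # (u'2013-11-14', 149289.0)
```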

Upvotes: 1

Mariusz

Reputation: 13936

You should compare the second value in tuple, like this:

agg.reduce(lambda a, b: a if a[1] > b[1] else b)
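The original lambda compared whole tuples, and Python compares tuples element by element, so `a > b` compares the date strings first and picks the latest date rather than the largest value. A sketch with `functools.reduce` on the sample data (plain Python in place of the RDD, as an assumption for illustration):

```python
from functools import reduce

# Sample (date, value) pairs taken from the question.
data = [
    (u'2013-10-17', 80325.0),
    (u'2014-01-01', 68521.0),
    (u'2013-11-10', 83691.0),
    (u'2013-11-14', 149289.0),
    (u'2013-11-18', 94756.0),
    (u'2014-01-30', 126171.0),
]

# Comparing whole tuples compares the date strings lexicographically,
# so this returns the row with the latest date.
latest = reduce(lambda a, b: a if a > b else b, data)

# Comparing only the second field returns the row with the largest value.
largest = reduce(lambda a, b: a if a[1] > b[1] else b, data)

print(latest)   # (u'2014-01-30', 126171.0)
print(largest)  # (u'2013-11-14', 149289.0)
```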

Upvotes: 1
