Luca Martinetti

Reputation: 3412

countApproxDistinctByKey in PySpark

I am trying to use countApproxDistinctByKey in PySpark (1.4 and 1.5), but I cannot find it.

https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala#L417

Am I missing something, or has it not been ported / wrapped yet?

Thanks

Upvotes: 0

Views: 458

Answers (1)

Justin Pihony

Reputation: 67075

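Nope, it hasn't been ported yet. As of 1.5, the only approximate distinct count available on a PySpark RDD is countApproxDistinct, which counts over the whole RDD rather than per key.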

Source code for the Python RDD
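
Until it is wrapped, one possible workaround is to count distinct values per key yourself with aggregateByKey. The sketch below is only an illustration, not the Scala method's algorithm: it is exact rather than approximate, so it holds every distinct value of a key in memory, which may be too expensive for high-cardinality keys. The helper names (`add_value`, `merge_sets`, `distinct_count_by_key_local`) and the RDD name `pair_rdd` are made up for this example. The local function just mimics the aggregation so the helpers can be checked without a cluster.

```python
def add_value(seen, value):
    """seqFunc: fold one value into the running set for a key."""
    seen.add(value)
    return seen


def merge_sets(left, right):
    """combFunc: merge the sets built for the same key on two partitions."""
    left.update(right)
    return left


def distinct_count_by_key_local(pairs):
    """Local stand-in for the aggregation, usable without Spark."""
    acc = {}
    for key, value in pairs:
        acc[key] = add_value(acc.get(key, set()), value)
    return {key: len(seen) for key, seen in acc.items()}


# With a pair RDD (hypothetical name `pair_rdd`) the same helpers would be
# passed to aggregateByKey, followed by mapValues to get the set sizes:
#   counts = pair_rdd.aggregateByKey(set(), add_value, merge_sets).mapValues(len)

sample = [("a", 1), ("a", 1), ("a", 2), ("b", 3)]
result = distinct_count_by_key_local(sample)
print(result)
```

If an approximation is really required, the set could in principle be swapped for a sketch structure such as HyperLogLog, but that would have to come from a third-party library or be written by hand.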

Upvotes: 1
