Reputation: 81
I am grouping an RDD by a key.
rdd.groupBy(_.key).partitioner
=> org.apache.spark.HashPartitioner@a
I see that, by default, Spark associates a HashPartitioner
with this RDD, which is fine by me: I agree that we need some kind of partitioner to bring like keys to one executor. But later in the program I want the RDD to forget its partitioning strategy, because I want to join it with another RDD that follows a different one. How can I remove the partitioner from the RDD?
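For reference, here is a minimal sketch of the behavior being asked about. The `Record` class, key names, and app name are illustrative, not from the question. It relies on a documented Spark behavior: transformations that may change keys, such as `map` (unlike `mapValues`), produce an RDD whose `partitioner` is `None`.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative record type; the question's actual type is not shown.
case class Record(key: String, value: Int)

object DropPartitionerSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("drop-partitioner").setMaster("local[*]"))

    val rdd = sc.parallelize(Seq(Record("a", 1), Record("a", 2), Record("b", 3)))

    val grouped = rdd.groupBy(_.key)
    println(grouped.partitioner)       // Some(org.apache.spark.HashPartitioner@...)

    // map may change keys, so Spark drops the partitioner here.
    val noPartitioner = grouped.map(identity)
    println(noPartitioner.partitioner) // None

    sc.stop()
  }
}
```

Note that removing the partitioner may not be necessary for the join itself: when two RDDs with different partitioners are joined, Spark picks a partitioner for the join and reshuffles whichever side does not already match it.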
Upvotes: 0
Views: 183