Reputation: 95
Suppose my mappers output N distinct keys, and I have K reducers. How can I write a custom Partitioner so that each reducer receives approximately N/K keys? Which keys go to which reducer is not important.
Example: Suppose my mappers output 10 pairs <k1,v1>,<k2,v2>,<k3,v3>,...<k10,v10>, and I have 3 reducers. I want 3 pairs to go to the 1st reducer, 3 pairs to the 2nd, and 4 pairs to the 3rd, no matter which keys go to which reducers.
What I attempted: sending <k1,v1> to the 1st reducer, <k2,v2> to the 2nd reducer, and so on. But some reducers still get much more data than others. Also, the keys k1,k2,...k10 of my mappers change according to the input data, so I would have to change the code for each input. Moreover, these keys have equal roles. I just need to distribute them equally between reducers. Thanks a lot.
Upvotes: 1
Views: 47
Reputation: 11609
The default partitioner (HashPartitioner) uses a hash function, which spreads keys roughly evenly by design, so you won't get better results unless you know something about the data, e.g. the exact values of the keys that should be distributed.
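To make the two cases concrete, here is a minimal sketch in plain Java with no Hadoop dependency. `hashPartition` repeats the arithmetic that Hadoop's HashPartitioner uses. `rankPartition` is a hypothetical alternative that only works under the assumption that the complete key set is known before the job runs. The class name, the helper names and the sample keys k1..k10 are made up for illustration. In a real job, logic like this would sit inside the `getPartition` method of a `Partitioner` subclass registered with `job.setPartitionerClass(...)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PartitionSketch {

    // Same arithmetic as Hadoop's HashPartitioner: clear the sign bit,
    // then take the remainder modulo the number of reducers.
    static int hashPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Hypothetical alternative, ASSUMING every key is known in advance:
    // a key's reducer is its rank in the sorted key list modulo K.
    static int rankPartition(List<String> sortedKeys, String key, int numReduceTasks) {
        int rank = Collections.binarySearch(sortedKeys, key);
        if (rank < 0) {
            throw new IllegalArgumentException("unknown key: " + key);
        }
        return rank % numReduceTasks;
    }

    // Number of keys each reducer would receive under hash partitioning.
    static int[] countByHash(List<String> keys, int k) {
        int[] counts = new int[k];
        for (String key : keys) {
            counts[hashPartition(key, k)]++;
        }
        return counts;
    }

    // Number of keys each reducer would receive under rank partitioning.
    static int[] countByRank(List<String> keys, int k) {
        List<String> sorted = new ArrayList<>(keys);
        Collections.sort(sorted);
        int[] counts = new int[k];
        for (String key : keys) {
            counts[rankPartition(sorted, key, k)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            keys.add("k" + i); // sample keys, illustrative only
        }
        System.out.println("hash: " + Arrays.toString(countByHash(keys, 3)));
        System.out.println("rank: " + Arrays.toString(countByRank(keys, 3)));
    }
}
```

With N distinct known keys, the rank version gives every reducer either floor(N/K) or ceil(N/K) keys, so 10 keys over 3 reducers split 4/3/3. The catch is that the sorted key list has to reach every partitioner instance, for example through the job configuration, and that is exactly the kind of knowledge about the data mentioned above.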
Upvotes: 0