cdt

Reputation: 95

Customize Partitioner to balance inputs to reducers

Suppose my mappers output N distinct keys and I have K reducers. How can I write a custom Partitioner so that each reducer receives approximately N/K keys? Which keys go to which reducer is not important.

Example: Suppose my mappers output 10 pairs <k1,v1>, <k2,v2>, <k3,v3>, ..., <k10,v10>, and I have 3 reducers. I want 3 pairs to go to the 1st reducer, 3 to the 2nd, and 4 to the 3rd, regardless of which keys go to which reducer.

What I attempted:

Thanks a lot.

Upvotes: 1

Views: 47

Answers (1)

AdamSkywalker

Reputation: 11609

The default partitioner (HashPartitioner) assigns each key to a reducer based on the key's hash code, which is designed to spread keys roughly evenly. You are unlikely to get better results unless you know something about the data in advance, e.g. the exact values of the keys that should be distributed.
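
To make the two cases concrete, here is a minimal sketch in plain Java with no Hadoop dependency, so it can be run on its own. `hashPartition` applies the same formula as Hadoop's `HashPartitioner`, `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`. `roundRobinAssignment` illustrates the "you know the exact keys" case and assumes the full key set is available before the job starts. The class and method names are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Hadoop.
class PartitionSketch {

    // Same formula as Hadoop's HashPartitioner: clear the sign bit,
    // then take the remainder modulo the number of reducers.
    static int hashPartition(Object key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Assumption: every key is known up front.
    // The i-th key in the list is assigned to reducer i % numReducers,
    // so reducer loads differ by at most one key.
    static Map<String, Integer> roundRobinAssignment(List<String> knownKeys, int numReducers) {
        Map<String, Integer> assignment = new HashMap<>();
        for (int i = 0; i < knownKeys.size(); i++) {
            assignment.put(knownKeys.get(i), i % numReducers);
        }
        return assignment;
    }

    // Counts how many keys land on each reducer under a given assignment.
    static int[] countPerReducer(Map<String, Integer> assignment, int numReducers) {
        int[] counts = new int[numReducers];
        for (int reducer : assignment.values()) {
            counts[reducer]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int numReducers = 3;
        List<String> keys = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            keys.add("k" + i);
        }

        // Distribution produced by the hash formula.
        Map<String, Integer> hashed = new HashMap<>();
        for (String key : keys) {
            hashed.put(key, hashPartition(key, numReducers));
        }
        System.out.println("hash:        "
                + Arrays.toString(countPerReducer(hashed, numReducers)));

        // Distribution produced by the precomputed round-robin table.
        System.out.println("round-robin: "
                + Arrays.toString(countPerReducer(roundRobinAssignment(keys, numReducers), numReducers)));
    }
}
```

In an actual job, the equivalent logic would go into a subclass of `org.apache.hadoop.mapreduce.Partitioner` that overrides `getPartition(key, value, numPartitions)` and looks the key up in the precomputed table. How that table reaches the tasks is left open here. With only the hash formula, the split is even on average but not guaranteed for a small number of keys.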

Upvotes: 0
